
ChaosSearch Blog


The Basics of Using AWS EventBridge for Observability

As you adopt modern, serverless, microservices-based architectures, it becomes harder to monitor and understand the state of your applications at any given time. That’s where the event bus capabilities of services like Amazon EventBridge come in handy. EventBridge can help you build loosely coupled, event-driven architectures and applications, and deploy new features faster. You can also use it for observability, collecting data from across your applications and services and sending it to external applications for log analytics, security operations, and other use cases.

 


 

In this article, we’ll explore some of the basics of using AWS EventBridge for observability, including how it integrates with other services such as Amazon CloudWatch and ChaosSearch.

 

Event-driven architectures and why they’re important

System events are critical telemetry for cloud-native, microservices-based environments. An event represents any change of state or update, such as a customer putting an item in the shopping cart of a retailer’s website. If you react well to system events, you can prevent issues from happening within your applications and infrastructure. For example, you can avoid provisioning too many resources, which can drive up the cost of your cloud bill. Or you can troubleshoot persistent issues based on analyzing your logs and system events.
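To make this concrete, the shopping-cart example might be represented as a custom event like the sketch below. The keys Source, DetailType, Detail, and Time follow the entry shape accepted by the EventBridge PutEvents API. The source name, detail type, and detail fields are hypothetical and chosen only for illustration.

```python
import json
from datetime import datetime, timezone


def cart_item_added_entry(customer_id: str, sku: str, quantity: int) -> dict:
    """Build one PutEvents-style entry describing a cart state change."""
    detail = {"customerId": customer_id, "sku": sku, "quantity": quantity}
    return {
        "Time": datetime.now(timezone.utc),
        "Source": "com.example.shop",    # hypothetical source name
        "DetailType": "CartItemAdded",   # hypothetical detail type
        "Detail": json.dumps(detail),    # Detail is passed as a JSON string
    }


entry = cart_item_added_entry("c-123", "sku-42", 2)
print(entry["DetailType"], entry["Detail"])
```

Keeping the state change in a small, self-describing payload like this is what lets downstream consumers react to it without knowing anything about the service that produced it.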

Event-driven architectures are common within serverless and microservices-based environments. They’re important because they help you improve agility by allowing you to act quickly on events, especially if your system uses decoupled components. You can use services like Amazon EventBridge to share information from microservices-based applications you build with other SaaS apps and external services, while keeping these various components isolated from one another.

Sharing this information is important for observability. Even though microservices are isolated by nature, the telemetry data contained within them can be correlated with other data sources to understand your system with more accuracy and depth. This is particularly important for log analytics across a broad range of observability use cases ranging from operational to cybersecurity log data and beyond.


 

Creating an event-driven observability system

As you can imagine, a serverless environment generates a high volume of event data. You can use EventBridge as an event bus to send data to a central location for analysis. Some people choose to use EventBridge with services like CloudWatch to centralize monitoring within a single AWS account. However, since most serverless environments generate such a high volume of logs, it can quickly become costly to analyze log data in CloudWatch.
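As a rough sketch of what sending events to a bus can look like, the helper below batches entries and hands them to a client's put_events call, which is the method boto3 exposes on its EventBridge ("events") client. PutEvents accepts at most 10 entries per request, hence the batching. The bus name and function names here are assumptions, not something the article prescribes.

```python
from typing import Iterator

MAX_ENTRIES_PER_CALL = 10  # PutEvents accepts at most 10 entries per request


def batches(entries: list, size: int = MAX_ENTRIES_PER_CALL) -> Iterator[list]:
    """Yield consecutive slices of at most `size` entries."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


def publish(client, entries: list, bus_name: str = "observability-bus") -> int:
    """Send entries to the named bus; return how many entries were rejected.

    `client` is expected to behave like boto3.client("events").
    `bus_name` is a hypothetical custom event bus.
    """
    failed = 0
    for batch in batches(entries):
        tagged = [{**entry, "EventBusName": bus_name} for entry in batch]
        response = client.put_events(Entries=tagged)
        # PutEvents reports partial failures instead of raising for them.
        failed += response.get("FailedEntryCount", 0)
    return failed
```

Because PutEvents can partially fail, checking the returned failure count (and retrying the rejected entries) matters more in a high-volume serverless environment than it might first appear.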

To reduce costs and increase efficiency, many organizations choose a best-of-breed observability approach. That allows you to use different external, API-connected services based on their strengths, and can dramatically reduce costs vs. using a single tool for all of your event monitoring and observability needs.

For example, you may choose to use EventBridge alongside CloudWatch alarms to flag anomalies, by centralizing EC2 events in a single repository. You can create a rule in EventBridge that matches those events and set a CloudWatch alarm on the metrics EventBridge publishes for that rule, or route matched events to a custom API that generates a service ticket when something goes wrong. These are just a few ways to use EventBridge.
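A rule for the EC2 case might use an event pattern like the one sketched below. The source and detail-type values are the ones EC2 uses for instance state-change events. The choice of states, the rule name, and the matches helper are assumptions for illustration. The matcher is a deliberately simplified local stand-in that handles only exact-value lists and nested objects, not the full EventBridge pattern language.

```python
import json

# Pattern for EC2 instance state changes; the chosen states are an assumption.
EC2_STATE_CHANGE_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def create_rule(client, name: str = "ec2-state-change") -> None:
    """Register the pattern as a rule; `client` would be boto3.client("events")."""
    client.put_rule(
        Name=name,  # hypothetical rule name
        EventPattern=json.dumps(EC2_STATE_CHANGE_PATTERN),
        State="ENABLED",
    )
```

Testing a pattern locally against sample events, as matches does here, is a cheap way to check a rule's intent before attaching targets to it.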

For deeper root cause analysis and troubleshooting use cases, CloudWatch users may face some limitations, though. This is especially true if you operate in a multi-cloud environment, and need to correlate data sources outside AWS. The good news is that in a best-of-breed observability environment, you don’t need to rely on a single tool. You can complement it with others that are stronger for discovering patterns across larger datasets or multiple data sources.

 

Why data retention is important for event-driven troubleshooting

Data retention is one of the biggest challenges of root cause analysis in a microservices-based, event-driven architecture. Many tools retain data for only 30 days (or even less), based on minimum compliance requirements. If an issue persists for longer (as many do), it becomes difficult and costly to rehydrate archived data or to retain data within certain observability tools for longer periods. As mentioned above, event-driven architectures generate vast amounts of logs, which means there is more data to retain, and for longer.

Retaining log data can empower your team to identify advanced persistent security threats or long-term user trends that may adversely impact your customer experience. Longer data retention windows provide deeper access to retrospective log data and better support for a variety of log analytics use cases. That’s why tools like ChaosSearch that offer unlimited data retention (via low-cost cloud object storage like Amazon S3) can be a great complement to existing AWS services like EventBridge and CloudWatch.


 

Using ChaosSearch with Amazon S3 and EventBridge

The ChaosSearch data lake platform reads, indexes, and analyzes data directly in Amazon S3 buckets, without moving or duplicating it. That minimizes data storage costs and lets you scale data analytics.

Cloud object storage, like Amazon S3 and Google Cloud Storage (GCS), offers near-infinite scalability, is secure and global, and is designed for 99.999999999% (11 nines) durability. It also happens to be the most common destination for cloud-native application logs, and the default storage for cloud infrastructure and services logs.
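One common way to land events in object storage is as newline-delimited JSON under date-partitioned keys. The sketch below assumes that layout. The prefix, key format, and batch identifier are hypothetical, and nothing in this article says that ChaosSearch or EventBridge requires them.

```python
import json
from datetime import datetime, timezone


def object_key(prefix: str, when: datetime, batch_id: str) -> str:
    """Build a date-partitioned object key (hypothetical layout)."""
    return f"{prefix}/{when:%Y/%m/%d/%H}/{batch_id}.ndjson"


def to_ndjson(events: list) -> bytes:
    """Serialize events as one JSON document per line."""
    lines = (json.dumps(event, sort_keys=True) + "\n" for event in events)
    return "".join(lines).encode("utf-8")


when = datetime(2023, 5, 17, 9, 30, tzinfo=timezone.utc)
key = object_key("eventbridge/ec2", when, "batch-0001")
body = to_ndjson([{"source": "aws.ec2", "detail": {"state": "stopped"}}])
print(key)  # eventbridge/ec2/2023/05/17/09/batch-0001.ndjson
# An upload could then use boto3.client("s3").put_object(
#     Bucket="my-log-bucket", Key=key, Body=body)  # bucket name is hypothetical
```

Partitioning keys by date keeps older data addressable by prefix, which is convenient when long retention windows are the goal.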

In our next EventBridge post, we’ll offer a deeper tutorial on how to use this popular event bus alongside ChaosSearch and S3 for a wide variety of use cases. Those could include allowing a fraud detection monitoring system to notify other systems, updating SaaS product usage metrics, creating a service ticket to deal with a system outage, actioning abandoned shopping carts on an ecommerce platform, and much more.

Want to learn more about how ChaosSearch and Amazon are better together?

Visit our AWS Partnership page.

 


About the Author, Dave Armlin

Dave Armlin is the VP of Customer Success at ChaosSearch. In this role, he works closely with new customers to ensure successful deployments, as well as with established customers to help streamline integrating new workloads into the ChaosSearch platform. Dave has extensive experience in big data and customer success from prior roles at HubSpot, Deep Information Sciences, Verizon, and more. Dave loves technology and balances his addiction to coffee with quality time with his wife, daughter, and son as they attack whatever sport is in season. He holds a Bachelor of Science in Computer Science from Northeastern University.