3 Pros and Cons of Amazon CloudWatch
Is your organization currently relying on Amazon CloudWatch for log management and log analytics in the cloud?
While CloudWatch delivers on many promises for AWS infrastructure monitoring, it isn’t the only log analytics solution – and may not even be your best option. Fast-growing organizations should consider supplementing CloudWatch with innovative alternatives offering better performance at scale, superior cost economics, reduced complexity and enhanced data access in the cloud.
In this blog post, we explore the most important pros and cons of leveraging CloudWatch for log analytics. We’ll highlight the key features and benefits that have driven CloudWatch adoption, along with the critical drawbacks that lead organizations to choose supplemental log management and log analytics solutions.
What Is CloudWatch?
If you’re an AWS user, you’re probably familiar with CloudWatch, and you may even be using it as a default observability option.
Amazon CloudWatch is a monitoring and observability solution for DevOps engineers, developers, site reliability engineers (SREs), and others interested in gathering data on applications and infrastructure. CloudWatch collects logs, metrics and events, and promises a unified view of the operational health of AWS resources, applications and services running both on-premises and in the cloud.
With CloudWatch, you can set alarms and rules to detect anomalies, and visualize your logs. However, some user interface and scalability issues can hold users back from leveraging CloudWatch for troubleshooting use cases.
How to Use CloudWatch
Typically, users tap CloudWatch to collect monitoring and operational data and visualize logs and metrics through dashboards. With CloudWatch, you can gain insights into how your end users experience your applications, and make changes to improve performance when needed. Many users create alarms based on anomalous metric behavior. Some alarms can trigger specific actions as a result, such as auto-scaling in AWS.
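As a rough illustration of the alarm-to-action flow described above, here is a minimal Python sketch that builds the arguments for CloudWatch's `put_metric_alarm` call in boto3. The group name `web-asg`, the truncated policy ARN, the 70% threshold and the helper names are all hypothetical, and the sketch uses a simple static threshold rather than anomaly detection.

```python
from typing import Any, Dict


def build_cpu_alarm(asg_name: str, scaling_policy_arn: str,
                    threshold_percent: float = 70.0) -> Dict[str, Any]:
    """Return keyword arguments for CloudWatch's PutMetricAlarm call."""
    return {
        "AlarmName": f"{asg_name}-high-cpu",
        "AlarmDescription": "Average CPU above threshold for 10 minutes",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 300,              # seconds covered by each datapoint
        "EvaluationPeriods": 2,     # two consecutive breaching datapoints
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scaling_policy_arn],  # fired when the alarm enters ALARM
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without the SDK

    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    alarm = build_cpu_alarm("web-asg", "arn:aws:autoscaling:...:scalingPolicy/example")
    print(alarm["AlarmName"])
```

Keeping the alarm definition separate from the API call makes it easy to review or unit test the thresholds before anything is created in your account.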
Common use cases for CloudWatch include monitoring metrics and logs to:
- Understand and resolve application issues
- Optimize AWS resources, including container resources
- Test application features
- And more.
Now that we’ve reviewed the basic ways to use Amazon CloudWatch, let’s turn our attention to the pros and cons of using CloudWatch as your log analytics solution.
CloudWatch Pros and Cons
Pros of CloudWatch
1. It’s an AWS-Native Service
Most people use CloudWatch because it’s already part of the Amazon ecosystem, which makes it easier to get up and running. As of March 2022, Amazon provides a one-click setup option for CloudWatch, making it much easier to monitor your resources and workloads. People like having an AWS-native service for infrastructure monitoring that can also trigger resource changes, such as auto scaling, as needed.
2. Easy to Set Up Alarms and Rules
Using the new one-click option makes it easier to set up alarms and rules as well. This one-click feature launches CloudWatch Application Insights, which automatically discovers the underlying resources in your account and sets up alarms to monitor their health. Application Insights also makes it simpler to create dashboards that display alerts and problems.
3. Collect Data from AWS or On-Premises Servers
CloudWatch offers users the ability to monitor metrics and logs from AWS resources, applications and services, as well as on-premises servers. This capability allows you to gain visibility into your entire system. While this high-level observability is great, the bigger challenges come in when users attempt to troubleshoot specific issues, or drill down into the root cause of persistent performance problems.
Cons of CloudWatch
1. Complex User Interface
When it comes to troubleshooting and root cause analysis, the CloudWatch user interface can become confusing and overly complicated. CloudWatch is a set of tools rather than a single one, so you need to jump from one tool to another to find what you’re looking for. We’ll talk more about scale later, but once you amass a high enough volume of logs, filtering and searching in the CloudWatch interface becomes far too complex.
As Duckbill Group analyst Corey Quinn put it in his scathing critique of CloudWatch (which he’s since walked back significantly), diagnosing problems in CloudWatch can be a major pain. He offers this example of diagnosing an error within an AWS Lambda function:
- Find the fact that it encountered an error in the first place by looking at the invocation error CloudWatch dashboard. I also could set up a filter to run a continuous query on the logs and alert when something shows up, except that isn't natively supported—I need a third-party tool for that (such as PagerDuty).
- Go diving into a variety of CloudWatch log groups and find the one named after the specific erroring function.
- Scroll manually through the many, many, many pages of log groups to find the specific invocation that threw an error.
- Realize that the JSON object that's retained isn't enough to troubleshoot with, cry in despair, and go write an article just like this one.
- Do some quick math and realize I'm paying an uncomfortable percentage of my AWS bill for a service that's only of somewhat marginal utility at best.
There’s got to be a better way…
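For readers who stay within CloudWatch, the manual scrolling step in that walkthrough can at least be scripted. The sketch below is a minimal Python example, assuming boto3, the default `/aws/lambda/<function-name>` log group naming, and that failing invocations log the literal term `ERROR`. The function name `checkout` and the helper names are hypothetical.

```python
import time
from typing import Any, Dict, Iterator, Optional


def lambda_log_group(function_name: str) -> str:
    """Default log group that Lambda writes to for a given function."""
    return f"/aws/lambda/{function_name}"


def build_error_query(function_name: str, lookback_minutes: int = 60,
                      pattern: str = "ERROR",
                      now_ms: Optional[int] = None) -> Dict[str, Any]:
    """Return keyword arguments for the CloudWatch Logs FilterLogEvents call."""
    end_ms = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "logGroupName": lambda_log_group(function_name),
        "filterPattern": pattern,  # assumption: errors contain this term
        "startTime": end_ms - lookback_minutes * 60 * 1000,  # epoch milliseconds
        "endTime": end_ms,
    }


def find_errors(function_name: str,
                lookback_minutes: int = 60) -> Iterator[Dict[str, Any]]:
    """Yield matching log events (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the query builder runs without the SDK

    paginator = boto3.client("logs").get_paginator("filter_log_events")
    query = build_error_query(function_name, lookback_minutes)
    for page in paginator.paginate(**query):
        for event in page["events"]:
            yield event


if __name__ == "__main__":
    # Hypothetical function name; prints the query that would be sent.
    print(build_error_query("checkout", lookback_minutes=30))
```

This only narrows the search within one log group. It does not address the other complaints in the list, such as too little retained context or the cost of the service.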
2. Difficult Pricing Predictability
Perhaps one of the most confusing things about CloudWatch is pricing. There are separate charges for queries and events, and pricing depends heavily on the metrics and dashboards you choose to use, the number of alarms and custom events you set up, the volume of logs you save and how long you choose to retain them. Pricing is not only high, but also hard to estimate. Imagine trying to estimate usage for every single service when most of your CloudWatch activity goes toward investigating events (e.g. errors, traffic spikes, security issues), which are impossible to predict accurately.
In short, it’s never clear how much you’ll pay in any given month, and pricing may vary dramatically depending on what you’re trying to achieve.
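To make the arithmetic concrete, here is a small Python sketch of a monthly estimate built from the cost drivers listed above. Every rate in it is an illustrative placeholder rather than a quoted AWS price, and the usage figures for the two months are invented for the example. Check the current CloudWatch pricing page before relying on any number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Illustrative placeholder rates in USD; NOT quoted AWS prices."""
    ingest_per_gb: float = 0.50
    storage_per_gb_month: float = 0.03
    query_per_gb_scanned: float = 0.005
    per_alarm: float = 0.10
    per_custom_metric: float = 0.30
    per_dashboard: float = 3.00


def estimate_monthly_cost(ingested_gb: float, stored_gb: float,
                          scanned_gb: float, alarms: int,
                          custom_metrics: int, dashboards: int,
                          rates: Rates = Rates()) -> float:
    """Sum the separate charges into one monthly figure, rounded to cents."""
    total = (
        ingested_gb * rates.ingest_per_gb
        + stored_gb * rates.storage_per_gb_month
        + scanned_gb * rates.query_per_gb_scanned
        + alarms * rates.per_alarm
        + custom_metrics * rates.per_custom_metric
        + dashboards * rates.per_dashboard
    )
    return round(total, 2)


if __name__ == "__main__":
    # Same alarms, metrics and dashboards; only log and query volume change.
    quiet = estimate_monthly_cost(500, 500, 200, 20, 50, 3)
    incident = estimate_monthly_cost(800, 800, 40_000, 20, 50, 3)
    print(f"quiet month:    ${quiet:,.2f}")
    print(f"incident month: ${incident:,.2f}")
```

The fixed items (alarms, metrics, dashboards) are easy to budget. The ingest and query terms are the ones that swing with incidents, which is why the monthly total is hard to predict in advance.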
3. Usability at Scale
Querying and retaining logs at scale becomes difficult from a management perspective. Once you reach terabyte scale with your logs (and wish to retain them beyond a short period of time such as a few days or a week), CloudWatch simply becomes too costly and difficult to use when it comes to diagnosing issues or identifying advanced persistent security threats. With that said, CloudWatch remains a great tool for monitoring your metrics.
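Retention is one of the few cost levers you control directly. The Python sketch below assumes boto3 and picks the shortest retention period that still covers a desired number of days. The list of allowed values reflects our understanding of the `PutRetentionPolicy` API at the time of writing and should be treated as an assumption to verify against the current documentation. The helper names are hypothetical.

```python
from bisect import bisect_left

# Assumption: retention periods (in days) accepted by PutRetentionPolicy.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def nearest_retention(desired_days: int) -> int:
    """Return the smallest allowed retention that is >= desired_days."""
    if desired_days < 1:
        raise ValueError("desired_days must be at least 1")
    index = bisect_left(ALLOWED_RETENTION_DAYS, desired_days)
    if index == len(ALLOWED_RETENTION_DAYS):
        raise ValueError("longer than the longest allowed retention period")
    return ALLOWED_RETENTION_DAYS[index]


def apply_retention(log_group: str, desired_days: int) -> None:
    """Set the retention policy on a log group (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so nearest_retention runs without the SDK

    boto3.client("logs").put_retention_policy(
        logGroupName=log_group,
        retentionInDays=nearest_retention(desired_days),
    )


if __name__ == "__main__":
    print(nearest_retention(10))
```

Shorter retention lowers storage charges, but it also removes the history you would need for long-running investigations, which is the trade-off described above.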
Want More from Your Log Analytics Solution? Complement CloudWatch with ChaosSearch
Organizations are turning to ChaosSearch for powerful log analytics in the cloud with unlimited data retention – without some of the complexity and cost unpredictability issues that come with CloudWatch.
ChaosSearch offers a unique approach to indexing data in the cloud, one that deploys directly on cloud object storage like AWS S3. In fact, CloudWatch users who are already sending data to S3 will find ChaosSearch adoption simple. The result is a SaaS-based log analytics solution that’s simple to deploy, scales linearly into massive data volumes and enables you to query logs without needing to create data pipelines.