How to Unlock Faster Analytics with Amazon S3 Express One Zone
Recently at re:Invent, AWS unveiled Amazon S3 Express One Zone. The new storage class responds to the demand for faster analytical queries while keeping the convenience of storing all of your application telemetry data centrally in cloud object storage. Until now, data-intensive applications often found object storage access slower than desired.
Let’s learn more about how this Amazon S3 Express One Zone announcement impacts data access speeds, and what that means for modern, microservices-based applications.
What is Express One Zone for S3?
Amazon S3 Express One Zone is a high-performance, single-Availability Zone storage class for S3 cloud object storage. If you are using S3 as a data lake, or as part of an embedded database within your application, query performance matters. S3 Express One Zone is designed to deliver consistent single-digit millisecond data access latency, which makes it easier to get fast answers from analytical workloads such as SQL queries or generative AI.
S3 Express One Zone can improve data access speeds by up to 10x and reduce request costs by up to 50% compared to S3 Standard. With faster data access, more efficient use of compute resources, and lower API request costs, you can analyze frequently accessed datasets at a lower overall total cost of ownership (TCO).
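To make the request-cost claim concrete, here is a minimal Python sketch of the arithmetic. The per-request prices, the workload size, and the function names are all assumptions for illustration, not current AWS pricing; substitute the published rates for your Region before drawing conclusions.

```python
# Hypothetical GET prices in USD per 1,000 requests. These are placeholders
# for illustration only, not quoted AWS pricing.
STANDARD_GET_PER_1K = 0.0004
EXPRESS_GET_PER_1K = 0.0002  # assumed to be 50% lower, per the "up to 50%" claim


def request_cost(requests: int, price_per_1k: float) -> float:
    """Return the request charge in USD for the given number of requests."""
    return requests / 1000 * price_per_1k


def savings_fraction(requests: int) -> float:
    """Return the fraction of request spend saved under the assumed prices."""
    standard = request_cost(requests, STANDARD_GET_PER_1K)
    express = request_cost(requests, EXPRESS_GET_PER_1K)
    return (standard - express) / standard


if __name__ == "__main__":
    monthly_gets = 500_000_000  # hypothetical telemetry query workload
    print(f"S3 Standard requests: ${request_cost(monthly_gets, STANDARD_GET_PER_1K):,.2f}")
    print(f"Express One Zone requests: ${request_cost(monthly_gets, EXPRESS_GET_PER_1K):,.2f}")
    print(f"Request savings: {savings_fraction(monthly_gets):.0%}")
```

This sketch models request charges only. Storage is priced differently across storage classes, so request savings are just one input to a full TCO comparison.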
Why is near real-time cloud object storage important?
This new offering from Amazon was built for data-intensive applications that require fast runtimes. While Amazon S3 Express One Zone can store objects of any size, it works best with large numbers of small objects. This is great for use cases like analyzing application telemetry data at scale.
As cloud-native applications are decomposed into microservices and deployed across dynamic and scalable cloud environments, gaining insight into the system's behavior becomes increasingly challenging. That’s because continuous growth in application usage has led to an exponential increase in the volume of telemetry data, including the four pillars of logs, metrics, events, and traces.
In the past, managing this explosion of data with S3 as a data lake may have led to slower queries. S3 Express One Zone improves that query performance. Combined with an analytics platform like ChaosSearch, teams experience 60% faster queries and substantial savings from lower query costs, without any code modifications.
This is important for the many developers who rely on analyzing telemetry data to make regular changes to their applications, often in the interest of improving customer experience. The overall goal is to allow DevOps teams to elevate operational telemetry to business-level insights. Here are a few examples:
- Cloud Imperium Games uses telemetry data to not only fix bugs and performance issues, but also understand in-game player behavior. Using this analysis, they make changes to the game design and mechanics, making play more balanced and challenging. Ongoing changes and improvements bring new players to the game, and keep loyal players coming back for more.
- A fintech company is using a similar event analytics strategy to understand which events are application- and transaction-related. The DevOps team records events issued by AWS Lambda functions to find out the number of transactions per merchant and the number of orders across merchants for customers.
Advantages of using Amazon S3 for analytics
There are many advantages to using Amazon S3 for analytics, in combination with a database like ChaosSearch. As noted above, Amazon S3 Express One Zone is good at managing large numbers of small objects (like cloud application telemetry data) while ensuring fast query performance. Plus, using S3 for analytics can provide major cost advantages over traditional observability services (think Splunk or Datadog), because you can retain more data for analysis within cloud object storage at much lower cost. Here are some other S3 advantages to note:
- Scalability of Amazon S3: Amazon S3 offers scalable storage for various data types. Its cost-effective, pay-as-you-use model allows for gradual expansion, backed by AWS's extensive data center network for virtually limitless storage. Moreover, general purpose S3 buckets support at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned prefix, so spreading objects across multiple prefixes increases aggregate read/write throughput.
- Robustness and Reliability of Data Storage: Amazon S3 Standard protects data by storing objects redundantly across multiple Availability Zones (AZs), so data survives even a complete facility outage. S3 is designed for 99.999999999% (11 nines) durability, which makes data loss extremely unlikely. Note that S3 Express One Zone keeps data within a single AZ that you select, so it trades this multi-AZ resilience for lower latency. Additional features like versioning and cross-region replication further fortify data safety.
- Universal Data Access: AWS cloud storage allows global data access on the web. S3 buckets, secure by default, can also be configured for public access. The Amazon S3 API facilitates programmatic data access, enabling developers to integrate S3 functionalities into their applications.
- Economical Storage Solutions: Shifting to cloud-based storage reduces capital costs and drives innovation, with AWS offering the most cost-efficient solution for enterprise data. S3 Storage Classes cater to diverse data storage and accessibility needs, ranging from S3 Standard or S3 Express One Zone for frequently accessed data to S3 Glacier Deep Archive for long-term storage, optimizing cost efficiency.
- Data Security in Amazon S3: Amazon S3 excels in data security, offering multiple control mechanisms. By default, it blocks public access to new buckets and encrypts all data upon upload. Administrators can manage access through tools like IAM, ACLs, S3 Access Points, and S3 Object Ownership. Additional security measures include continuous threat monitoring with Amazon GuardDuty, compliance support through S3 Object Lock, and data integrity verification using checksum algorithms.
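The scalability point above mentions spreading objects across key prefixes. Here is one way that could look for telemetry data, as a minimal sketch: the key layout, the shard count, and the `telemetry_key` helper are hypothetical choices for illustration, not an AWS or ChaosSearch convention.

```python
import hashlib
from datetime import datetime, timezone


def telemetry_key(service: str, event_id: str, ts: datetime, shards: int = 16) -> str:
    """Build an object key whose leading prefix spreads writes across shards.

    Hashing the event ID gives a stable shard number, so the same event
    always maps to the same prefix. The layout itself is an assumption.
    """
    digest = hashlib.sha256(event_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return (
        f"shard={shard:02d}/service={service}/"
        f"dt={ts:%Y-%m-%d}/hour={ts:%H}/{event_id}.json"
    )


if __name__ == "__main__":
    when = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
    print(telemetry_key("checkout", "evt-123", when))
```

Because per-prefix request rates add up, writing to 16 shard prefixes in parallel gives the workload roughly 16 times the per-prefix headroom, at the cost of queries having to fan out across shards.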
The power of Amazon S3 Express One Zone + ChaosSearch LakeDB
For performance-intensive applications, S3 Express One Zone lets you scale efficiently to process millions of requests per minute using the existing Amazon S3 APIs, without pre-provisioning capacity or modifying existing applications. Together with an embedded database like ChaosSearch LakeDB, you can scale your capacity to deliver business-critical analytics based on your cloud-native applications’ data, transforming the user experience.
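As a rough illustration of using the existing S3 APIs, the sketch below writes one telemetry object with the boto3 SDK. S3 Express One Zone stores data in directory buckets, whose names follow the pattern `base-name--zone-id--x-s3`. The bucket base name, zone ID, object key, and both helper functions are assumptions made for this example, and the upload requires boto3, AWS credentials, and an existing directory bucket.

```python
def directory_bucket_name(base_name: str, az_id: str) -> str:
    """Compose a directory bucket name in the base-name--zone-id--x-s3 format."""
    return f"{base_name}--{az_id}--x-s3"


def put_telemetry_object(bucket: str, key: str, body: bytes) -> None:
    """Upload one object with the standard PutObject call.

    boto3 is a third-party SDK, imported lazily so the naming helper above
    works without it installed.
    """
    import boto3

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=body)


if __name__ == "__main__":
    # Hypothetical base name and zone ID; use your own values.
    bucket = directory_bucket_name("telemetry", "usw2-az1")
    print(bucket)
    # Uncomment once credentials and the bucket exist:
    # put_telemetry_object(bucket, "logs/evt-123.json", b'{"status": "ok"}')
```

The `put_object` call has the same shape it would have for an S3 Standard bucket; only the bucket name changes.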
Want to make it both easy and fast to analyze your log and event data, on your terms? ChaosSearch transforms your Amazon S3 cloud object storage into a live analytics database with support for full-text search, SQL, and GenAI workloads, with no data movement and unlimited data retention. Using proprietary technology to aggregate diverse data streams into one data lake database, ChaosLakeDB automates data pipelines and schema management. It enables both real-time and historical analytics, with lower management overhead, no stability issues, no data retention tradeoffs, and up to 80% lower costs vs. alternatives like Elasticsearch.
These low costs, combined with the higher speeds of Amazon S3 Express One Zone, make it much easier for you to gain near-real-time data access at scale.