Make Your AWS Data Lake Deliver with ChaosSearch (Webinar Highlights)
When Pentaho CTO James Dixon coined the term “data lake” in 2011, he imagined a single storage repository where organizations could store both structured and unstructured data in their raw format until it was needed for analytics.
But without the right storage technology, data governance, or analytical tools, the first data lakes quickly became “data swamps” - morasses of data with no organizational structure and no efficient way to access or extract meaningful insights.
These early challenges led many organizations to move away from data lakes and focus on alternative solutions for data storage and analytics - until now.
It’s Time to Make Your AWS Data Lake Deliver
Our newest webinar, Make Your AWS Data Lake Deliver, explores the underlying philosophy and bleeding-edge technology that’s driving a new generation of productive and efficient data lakes powered by cost-effective Amazon S3 storage.
The webinar features:
- Kevin Miller, Vice President and GM of Amazon S3 at AWS
- Mark Hill, Senior Director of IT Operations at Digital River
- Thomas Hazel, Founder and CTO at ChaosSearch
- Ed Walsh, CEO at ChaosSearch
In three jam-packed informational segments, our guests shared their unique perspectives. They offered exclusive insights into the state of data lakes today, the impact that modern data lakes can have on cloud-first businesses, and the future of cloud object storage - you won’t want to miss it!
Our blog this week recaps just a few highlights from the conversations that took place during this hour-long webinar event.
Hoping to dig deeper into the details? Check out the entire webinar here.
Kevin Miller Shares Data Challenges and Opportunities for Amazon S3 Customers
Amazon S3 has become a de facto standard for modern data lake storage, thanks to its seamless scalability, durability, and cost-effectiveness. As Vice President and General Manager of Amazon S3, our first guest Kevin Miller offers a true insider’s perspective on the future of data lakes in the cloud.
Understanding Customer Challenges
As the leader of Amazon S3, Kevin Miller holds a deep understanding of the challenges his customers face when it comes to developing insights and extracting the full value of data stored in Amazon S3.
When asked to shed light on some of those challenges, Miller shared that customers are increasingly focused on business transformation when it comes to making use of their data. Unlike prior optimizations that focused on efficiency and improving the bottom line, customers are now thinking outside the box and looking for data-driven ways to drive top-line revenue, whether by generating new product ideas or accelerating customer acquisition.
Opportunities Using Amazon S3 with ChaosSearch
ChaosSearch sits on top of Amazon S3, transforming your cloud object storage into a hot data lake and enabling analytics at scale with no ETL process, no data movement, and no data retention trade-offs. As a result, customers of AWS + ChaosSearch can retain more of their data and accelerate insights that drive innovation.
When asked about opportunities and desired outcomes for AWS + ChaosSearch customers, Miller told us that customers want to reduce the time it takes for data-driven initiatives to deliver measurable results that improve the business.
“ChaosSearch cuts down on what I’d call undifferentiated heavy lifting - the work to index, organize, and catalog data,” Miller says. “With ChaosSearch, it’s that automatic indexing, being able to index data in S3 buckets, keep the index fresh, and allow customers to innovate with their data.”
Watch the entire webinar to hear more from Amazon S3 VP and GM Kevin Miller, including updates to S3 Intelligent-Tiering, analyzing big data at scale, and the future of cloud object storage.
Mark Hill Pulls Back the Curtain on Digital River’s Data Lake Transformation
Five years ago, eCommerce platform Digital River launched a cloud migration initiative that would shut down its ten global data centers, migrate its operations onto AWS, and reshape Digital River into a cloud-native business.
As Senior Director of IT Operations at Digital River for the past three years, our second guest Mark Hill had plenty to share about the company’s journey into the cloud and the adoption of modern data lake technology.
Cloud Transformation at Digital River
Digital River’s migration into the cloud was a two-year process with significant challenges, especially when consolidating data from data centers in multiple regions.
When the migration was complete, Hill explains, Digital River adopted a “cloud-first” vision, choosing to consume cloud Infrastructure-as-a-Service (IaaS) and to purposefully evaluate and adopt Software-as-a-Service (SaaS) solutions. In doing so, Digital River shifted its focus away from time-consuming operational tasks and put resources into revenue-generating products.
Leveraging ChaosSearch for Analytics at Scale
Digital River previously relied on a 10-year-old ELK stack and self-managed analytics and reporting solution that was both costly and difficult to maintain.
Over time, Hill explains, his team made the conscious decision to save costs by capturing only essential data and limiting data retention to seven days. These data retention limits resulted in the loss of important data points used for incident triage, problem management, trending, and other use cases.
“We were losing out,” Hill says. “If there was a low-priority incident and people didn’t investigate until 8-9 days later, all of the logs are gone, and there’s no way to find the root cause or resolve the incident.”
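The retention gap Hill describes comes down to simple date arithmetic. The following is a minimal Python sketch, assuming the seven-day window mentioned above; the function name `logs_available` and the incident timestamp are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Seven-day retention window, as described in the text.
RETENTION = timedelta(days=7)


def logs_available(incident_time: datetime, investigated_at: datetime,
                   retention: timedelta = RETENTION) -> bool:
    """Return True if logs written at incident_time still exist at investigated_at."""
    return investigated_at - incident_time <= retention


# Hypothetical incident timestamp, used only for illustration.
incident = datetime(2021, 3, 1, 12, 0)

for delay_days in (3, 8, 9):
    investigated_at = incident + timedelta(days=delay_days)
    print(delay_days, logs_available(incident, investigated_at))
```

With a seven-day window, an investigation that starts on day 8 or 9 finds no logs. Passing a longer `retention` value (months or years) is the case Hill describes after moving to ChaosSearch.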
After adopting ChaosSearch, Hill and his team at Digital River could circumvent those data retention trade-offs and have more of their data available at a lower cost. Additionally, Hill commented, “ChaosSearch has offered a manageable and cost-effective opportunity to store months or even years of data that we can use for operations, as well as trending, automation, and pushing into an event-driven architecture so we can proactively manage our services.”
Check out the full webinar to hear everything Mark Hill had to say about:
- The challenges of working with legacy analytics software
- The benefits of migrating to the cloud
- Emerging data lake use cases at Digital River
How To Find What’s Hidden in Your Data Lake
The final segment of Make Your AWS Data Lake Deliver wraps up the webinar with a live interview hosted by Dave Vellante featuring Thomas Hazel and Ed Walsh.
The trio touches on a variety of topics, from the history of data lakes and data lake philosophy to the capabilities of modern data lakes and the future of data mesh architectures.
What’s Hidden in Your Data Lake?
Historically, it has been a challenge for enterprises to uncover the hidden insights in their data lakes. Even when it’s been easy to ingest data into the lake itself, the cost and complexity of normalizing the data, building schema, and doing ETL have prevented organizations from analyzing much of the data they collect and store.
The data continues to pile up, but its value remains hidden behind a wall of complexity and toil.
When asked why enterprises are still struggling to get insights from their data lakes, Ed Walsh reminds us that ChaosSearch was founded to address that exact problem. “We haven’t changed the way we do data prep since the 2000s,” says Walsh. “It takes between 3 weeks and 3 months to process ETL data requests and uses the same skill set of people that you want driving digital transformation, data warehousing initiatives, and modernization. These are people you don’t have enough of in the enterprise.”
With ChaosSearch, Walsh explains, you put your data in S3, don’t move it, and don’t transform it. “In fact, we’re against data movement,” he says. “Simply point us at that data, and we index that data and make it available in a representation that lets you give virtual views to users. We get rid of the physical ETL, which is 80% of the work.”
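To make the idea of a virtual view concrete, here is a small Python sketch of the general pattern: the raw records stay exactly as they were written, and a view selects and filters fields only at read time. This is an illustration of the concept under stated assumptions, not ChaosSearch’s implementation or API; the sample log lines and the `virtual_view` helper are hypothetical.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical raw log lines, standing in for objects left untouched in storage.
RAW_LINES: List[str] = [
    '{"ts": "2021-03-01T12:00:00Z", "level": "ERROR", "service": "checkout", "msg": "timeout"}',
    '{"ts": "2021-03-01T12:00:05Z", "level": "INFO", "service": "checkout", "msg": "retry ok"}',
    '{"ts": "2021-03-01T12:01:00Z", "level": "ERROR", "service": "search", "msg": "bad query"}',
]


def virtual_view(lines: Iterable[str], fields: List[str],
                 predicate: Callable[[Dict], bool] = lambda record: True) -> Iterator[Dict]:
    """Lazily yield the selected fields of matching records.

    The raw lines are only read, never copied into a new store or rewritten.
    """
    for line in lines:
        record = json.loads(line)
        if predicate(record):
            yield {field: record.get(field) for field in fields}


# A view of error events that exposes only the timestamp and the service name.
errors = list(virtual_view(RAW_LINES, ["ts", "service"],
                           lambda record: record["level"] == "ERROR"))
print(errors)
```

Two users can define different views over the same raw lines without a separate transformed copy per use case, which is the physical ETL step Walsh says is removed.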
ChaosSearch’s Strategic Partnership with AWS
Amazon S3 provides the most cost-effective data storage solution for enterprises, so it always made sense for ChaosSearch to launch its data lake platform on AWS.
Now, ChaosSearch and AWS are collaborating via AWS Partner Network (APN) programs to amplify ChaosSearch’s message about the value and benefits of leveraging Amazon S3 + ChaosSearch for analytics at scale.
“I always believed in data lake philosophy…,” Thomas Hazel says, “...however, HDFS wasn’t really a service. Cloud object storage is a service - the elasticity, the security, the durability, all those benefits are really why we founded on cloud object storage as a first move.”
Curious how Ed Walsh and Thomas Hazel explain the data lake philosophy and inspiration behind ChaosSearch? Check out the full webinar.