In the security world, dwell time is the length of time a threat actor is present and undetected in a network, and it is a common metric in cybersecurity. In 2017, Mandiant reported that the dwell time in a corporate environment was about 99 days. Most of the customers we talk to keep only 7 to 14 days' worth of data in their Elasticsearch clusters. The reason for such a short retention window is the high cost of running an Elasticsearch cluster or ELK stack.
Moreover, these costs are not only the hard operating costs of Amazon EC2 and Amazon EBS disks. They also include the time and energy of your engineers who need to provide care and feeding to keep the systems running. You could leave those events in their raw form in storage like Amazon S3, but when a security event happens, how do you get answers from your data?
Amazon Athena lets you run SQL queries on your security and compliance events, but you still have to define a schema before you can ask your first question. The companies we've spoken with who are struggling with Athena tell us that just managing the cost of every query is a big challenge for them. Athena also lacks an integrated visualization platform, so operators have to spend more time and energy deploying tools like Tableau or integrating with Amazon QuickSight.
The cost difference between a $3.00 Athena query and a $300 Athena query could be as simple as a poorly formed SQL statement.
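To make that gap concrete, here is a rough sketch of the arithmetic. It assumes Athena bills by the amount of data a query scans and uses an assumed list price of $5 per terabyte; check current pricing for your region. The table size and the day-based partitioning are hypothetical numbers chosen for illustration.

```python
# Assumption: Athena charges per terabyte scanned; $5/TB is an assumed list price.
PRICE_PER_TB_USD = 5.00
BYTES_PER_TB = 1024 ** 4


def athena_query_cost(bytes_scanned: float, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Estimate the cost in USD of one query from the bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Hypothetical table: 60 TB of logs, partitioned by day across 100 days.
table_bytes = 60 * BYTES_PER_TB
days_of_data = 100

# A query with no partition filter scans the whole table.
full_scan_cost = athena_query_cost(table_bytes)

# The same question restricted to a single day's partition scans 1/100 of it.
one_day_cost = athena_query_cost(table_bytes / days_of_data)

print(f"full scan: ${full_scan_cost:.2f}, one day: ${one_day_cost:.2f}")
```

Under these assumptions, the only difference between the two queries is a missing filter on the partition column.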
So how do people analyze their data and ask questions? Sadly, many have to spend time and energy on ETL (extract, transform, load) frameworks to process their data into tools like Amazon EMR (Elastic MapReduce). In many ways, Athena can replace that solution. However, you still must deploy or integrate a visualization tool to aggregate and visualize your data. While Amazon removes the complexity of server deployment for building a data lake, there are still a lot of moving pieces when it comes to integrating Amazon S3, Glue, EMR, Athena, and other services to get answers to your questions.
The inherent complexity and the limitations of some Amazon tools are why we have seen customers and prospects solve this problem by deploying Elasticsearch, Logstash, and Kibana (more commonly referred to as the ELK stack). The ELK stack is a fantastic suite of open-source tools that lots of companies use for their log and event storage problems. While many companies think they are getting a free solution to their problems when they adopt open-source technologies, they forget the familiar adage:
Open-source is only free if your time is worth nothing.
What we've heard from customers who have deployed the ELK stack for security observability is that they must choose between retention and their ever-growing AWS bills. In many cases, these customers have provisioned a fixed number of servers to handle their current security log and event load. In one case, a customer started with 90 days of retention for their log data. Over the following year, as their data volume increased, they kept shortening their retention window so that they didn't have to spend more on their sprawling Elasticsearch cluster. A few months later, their log data had grown so much that they were down to only a week of retention.
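The squeeze in that story follows from simple arithmetic: with a fixed amount of storage, retention is capacity divided by daily ingest. The sketch below uses hypothetical numbers (9,000 GB of usable capacity, 100 GB per day of logs to start, 20% growth per month) and is not the customer's actual data.

```python
def retention_days(usable_capacity_gb: float, daily_ingest_gb: float) -> int:
    """Whole days of logs that fit in a fixed amount of storage."""
    if daily_ingest_gb <= 0:
        raise ValueError("daily_ingest_gb must be positive")
    return int(usable_capacity_gb // daily_ingest_gb)


# Hypothetical cluster and workload, chosen only to illustrate the trend.
CAPACITY_GB = 9_000
START_INGEST_GB_PER_DAY = 100
MONTHLY_GROWTH = 1.20  # ingest grows 20% each month (assumption)

# Retention the fixed cluster can hold at the end of each month.
retention_by_month = [
    retention_days(CAPACITY_GB, START_INGEST_GB_PER_DAY * MONTHLY_GROWTH ** month)
    for month in range(15)
]

for month, days in enumerate(retention_by_month):
    print(f"month {month:2d}: {days} days of retention")
```

With these assumed numbers, the cluster starts at 90 days of retention and is down to about a week after 14 months, without anyone ever deciding that a week is enough.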
You can probably see where this story is going. Months later, the company identified a breach, but the logs that would have shown where the attacker went and what data they took were long gone. With CHAOSSEARCH, this customer now keeps all their security and compliance logs on their own Amazon S3. They can go back in time, any time, to find answers to their questions, and they can do all of this without having to run and maintain any database servers.
One of the key features of the CHAOSSEARCH platform is the ability to fully index every single field in your documents. With Elasticsearch, you have to pick and choose what to index to meet your index size requirements. With data that is indexed by CHAOSSEARCH, every field is available for hunting and searching for answers. Since every event is fully indexed, our customers can send the original copy of each log and security event to Amazon's Glacier service to cut their storage bills even further and to meet their legal and compliance needs.
The long-term retention of security and compliance logs will continue to be a major initiative and focus for companies, especially as regulations continue to evolve with the growing cyber threat landscape. As more and more high-profile breaches show up in the news, companies will be under pressure to retain as many of their security and compliance events as possible to answer "who did what and when."
CHAOSSEARCH is the first service that can turn your data on Amazon S3 into a searchable Elasticsearch cluster. You can leverage the low-cost, secure storage of Amazon S3 to retain an unlimited amount of your security and event data. And because we have extended the Elasticsearch API on top of your data on your Amazon S3, your engineers can continue using the tools they know and love, like Kibana, to get immediate answers to their questions, no matter when the event happened. It's all there, all on your Amazon S3, for hunting, querying, and visualizing anytime.
Interested in learning how your company can get access to months or years' worth of your log and event data using the power of your Amazon S3 infrastructure? Reach out below. We'd love to chat more about your challenges and the problems you are trying to solve.