
ChaosSearch Blog


The Power of Combining a Modular Security Data Lake with an XDR

The average cost of a data breach is expected to hit $5 million in 2023. For many organizations, it is a matter of when, not if, a cybersecurity incident will occur. Attackers are becoming more sophisticated and relying on weak links to exploit company applications and infrastructure.

At the same time, the traditional network security perimeter has changed (and all but disappeared), a trend driven by cloud computing and remote work. As a result, the sheer volume of telemetry data that security teams must analyze and retain for threat detection, incident response, and threat hunting can easily overwhelm legacy security analytics tools.

This blog post will look at how tools like Security Information and Event Management (SIEM) systems perform in cloud-native environments, and compare them with emerging approaches such as combining an Extended Detection and Response (XDR) with a modular security data lake.




What is a security data lake?

As threats and attack vectors multiply and increase in complexity, it is essential to store data longer and to bring in more data sources. A security data lake can help teams sift through the noise, investigate, respond, and mitigate real threats as they emerge, as well as look at the entire lifecycle of an incident comprehensively.

On top of that, a flexible layer of automation can drive analysis of the many data sources in a security data lake, assess risk, and bring in security teams when conditions call for human review.

Some tools, like SIEMs, face challenges when it comes to scale, cost and root cause analysis capabilities. We’ll cover those in more depth below. That’s why many security teams choose a data lake, which separates storage from compute. A modular security data lake can be built on top of low-cost cloud object storage like Amazon S3 or Google Cloud Storage (GCS). From there you can index and search log data and other application telemetry data at a lower cost and at scale.

Security data lakes help you centralize and store very large volumes of data in one place, so analysts don’t have to access logs across several different sources. This supports many security use cases, particularly threat hunting and detecting advanced persistent threats.

A security data lake can be used in tandem with an XDR (or a SIEM, in many cases) to detect and respond to cloud-based security threats.


XDR vs. SIEM: What’s the difference?

Depending on the vendor, an XDR and a SIEM may have some overlapping capabilities. Let’s review the similarities and differences, including how an XDR could potentially replace a SIEM when reinforced by a security data lake. In some cases where a SIEM is already deeply embedded, complementing a SIEM (like Splunk) with a security data lake can be more cost-effective and performant, especially for security investigation workloads that require a large volume of historical log data.


What is a SIEM?

Many organizations use a SIEM for security analytics and threat hunting. A SIEM analyzes log data and telemetry, and provides real-time alerts about potential incidents. Key capabilities of a SIEM may include:

  • Event correlation and analytics
  • Incident monitoring and security alerts
  • Compliance management and reporting


Drawbacks of a SIEM

The sheer volume of cloud telemetry data can make SIEM systems impractical as the sole security analytics tool for many organizations. Some of the most common challenges include:

  • Cost and maintenance: The upfront costs, along with the costs of setting up, fine-tuning and maintaining a SIEM, can add up. This includes the cost of data retention and processing. Security analysts often need to manually tag and continually update these systems to keep signals accurate, which can take a lot of time for companies facing a security talent shortage.
  • Data retention: Many SIEMs only retain data for 30 days due to cost issues, which limits the ability to investigate advanced persistent threats.
  • Alert fatigue: SIEMs throw a lot of false positives, which makes it hard for security analysts to distinguish between signal and noise within their real-time alerts.
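
To make the retention point concrete, here is a minimal Python sketch. It assumes a fixed retention window and uses hypothetical names (`is_still_searchable`, `retention_days`); it is not tied to any particular SIEM's API. It simply checks whether the earliest activity in an investigation is still inside the searchable window.

```python
from datetime import datetime, timedelta


def is_still_searchable(event_time, now, retention_days=30):
    """Return True if an event is recent enough to still be retained.

    `retention_days` is an illustrative setting, not a vendor default.
    """
    oldest_retained = now - timedelta(days=retention_days)
    return event_time >= oldest_retained


# Hypothetical investigation: the intrusion began 75 days before detection.
now = datetime(2023, 6, 1)
initial_access = now - timedelta(days=75)

# With a 30-day window the initial access is already gone.
print(is_still_searchable(initial_access, now, retention_days=30))   # False
# With a year of retention it can still be searched.
print(is_still_searchable(initial_access, now, retention_days=365))  # True
```

In this made-up scenario, the first 45 days of attacker activity would fall outside a 30-day window, which is the gap a longer-retention store is meant to close.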

Read: Why Midsized SecOps Teams Should Consider Security Log Analytics Instead of Security and Information Event Management


What is an XDR?

Security attacks today are increasingly sophisticated and rarely exploit a single endpoint. An XDR can move beyond the limits of a SIEM by providing comprehensive monitoring of the entire attack surface. Having this broader visibility means that an XDR can identify more patterns in your data to detect potential threats. The goal is to help security teams correlate seemingly disconnected events, to take immediate action and mitigate cybersecurity threats.



Both an XDR and a SIEM are designed to collect and analyze security data within a central location. However, a SIEM and XDR are different in a few key ways:

  • Focus: SIEM solutions primarily offer centralized log management, alerting and analysis capabilities. XDR focuses on contextualizing the data it collects to enhance threat detection and response.
  • Complexity: As covered above, SIEM solutions often require ongoing management and fine-tuning. XDR solutions are designed to integrate with the other parts of an organization’s security architecture, such as a security data lake, to provide useful alerts.
  • Incident response: While a SIEM can provide security operations center (SOC) analysts with the data and alerts required to identify potential threats, an XDR also includes the ability to support and coordinate response efforts.
  • Costs: An XDR combined with a security data lake can cost significantly less when it comes to data ingestion and retention. This is particularly important for threat hunting use cases with advanced persistent threats.




Using a modular security data lake with an XDR

Between SIEM, SOAR (security orchestration, automation and response), XDR and other security monitoring and analytics technologies, things can get confusing and costly for organizations, fast. In many cloud-native environments, an XDR plus a security data lake can be a better fit than a SIEM, particularly if a SIEM is not already in place.


The advantages of a security data lake vs. a SIEM

As mentioned above, a security data lake is often used in tandem with a SIEM to reduce the cost associated with ingesting and storing a high volume of log data and telemetry data, which is typical in cloud-native environments. Using a SIEM for real-time detection and a security data lake for deep investigation can be a great way to reduce costs and play to the strengths of each of these security analytics solutions.
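
One way to picture this split is a simple router that sends every event to the data lake and forwards only higher-severity events to the SIEM. The Python sketch below is illustrative only: the `Event` and `Router` classes, the 0 to 10 severity scale, and the threshold of 7 are all assumptions, and the in-memory lists stand in for real SIEM and object-storage destinations.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    source: str
    severity: int  # hypothetical scale: 0 (informational) to 10 (critical)
    message: str


@dataclass
class Router:
    siem_min_severity: int = 7                      # assumed threshold
    siem: list = field(default_factory=list)        # stand-in for SIEM ingest
    data_lake: list = field(default_factory=list)   # stand-in for object storage

    def route(self, event):
        # Everything is retained in the data lake for later investigation.
        self.data_lake.append(event)
        # Only events at or above the threshold reach the SIEM for real-time alerting.
        if event.severity >= self.siem_min_severity:
            self.siem.append(event)


router = Router()
for ev in [
    Event("vpc-flow", 1, "accepted connection"),
    Event("auth", 8, "privilege escalation attempt"),
    Event("cdn", 0, "cache hit"),
]:
    router.route(ev)

print(len(router.data_lake), "events retained;", len(router.siem), "sent to the SIEM")
```

The design choice here is that the data lake receives a full copy, so an analyst who starts from a SIEM alert can pivot to the complete history without the SIEM having ingested it.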

Some of the advantages of a security data lake include:

  • Data retention and storage costs: A security data lake can help you leverage existing data lake investments built on cloud object storage, and retain telemetry data at a lower cost.
  • Ease and low cost of data ingestion: A security data lake can ingest data from multiple sources easily, augmenting some of the capabilities of a SIEM with less ongoing maintenance.
  • Ease of data indexing and search for incident response: Security teams can layer on a solution like ChaosSearch to index and search data within Amazon S3 or Google Cloud Storage without data movement, making security investigations much simpler to conduct.


Unbundling the SIEM

Layering an XDR with a security data lake can result in better coverage than a SIEM alone, by providing multiple layers of security. Remember how SIEMs often cause alert fatigue, due to the overwhelming amount of individual alerts triggered by the system? These can make it difficult to understand which threats need immediate attention. An XDR can solve these issues by correlating and connecting log data to gain context into a security event.
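
As a rough illustration of what correlation means here, the Python sketch below groups alerts that involve the same entity (a user or host) and occur close together in time into a single incident. The alert fields (`entity`, `time`, `rule`), the 15-minute window, and the `correlate` function are assumptions made for this example; real XDR products are likely to use much richer logic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate(alerts, window=timedelta(minutes=15)):
    """Group alerts per entity; an alert joins the current incident when it
    arrives within `window` of the previous alert for that entity."""
    by_entity = defaultdict(list)
    for alert in alerts:
        by_entity[alert["entity"]].append(alert)

    incidents = []
    for items in by_entity.values():
        items.sort(key=lambda a: a["time"])
        current = [items[0]]
        for alert in items[1:]:
            if alert["time"] - current[-1]["time"] <= window:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents


# Made-up alerts for illustration only.
alerts = [
    {"entity": "alice", "time": datetime(2023, 6, 1, 9, 0), "rule": "impossible_travel"},
    {"entity": "alice", "time": datetime(2023, 6, 1, 9, 5), "rule": "mfa_disabled"},
    {"entity": "alice", "time": datetime(2023, 6, 1, 9, 12), "rule": "mass_download"},
    {"entity": "bob", "time": datetime(2023, 6, 1, 9, 7), "rule": "failed_login"},
    {"entity": "alice", "time": datetime(2023, 6, 1, 14, 0), "rule": "failed_login"},
]

incidents = correlate(alerts)
print(f"{len(alerts)} alerts -> {len(incidents)} incidents")  # 5 alerts -> 3 incidents
```

Instead of five separate notifications, an analyst would see three incidents, one of which chains three related alerts for the same user.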

The deep activity data within an XDR can be fed into a security data lake for more extensive threat-hunting and investigation capabilities. This can reduce the total cost of ownership (TCO) of a security analytics solution, making it more cost-efficient and performant to conduct deeper searches of log data that extend beyond 30 days.



A security data lake powered by solutions like ChaosSearch can reduce the TCO of tools like a SIEM. Sending critical security logs to ChaosSearch can reduce storage and maintenance costs of a SIEM, without limits on ingest or retention. For companies that don’t have a SIEM in place, or are frustrated by the limitations of a SIEM, it may be more economical and effective to use an XDR with a security data lake for contextually aware alerting paired with deeper threat hunting and investigation capabilities.

As an added benefit, a security data lake can improve compliance by retaining data beyond the 30-day window many SIEMs implement. Ultimately, there’s no one tool to rule them all in a cloud-native world. A modular security data lake can dramatically reduce the costs of traditional security solutions like a SIEM, while adding the context of an XDR.



About the Author, David Bunting

David Bunting is the Director of Demand Generation at ChaosSearch, the cloud data platform simplifying log analysis, cloud-native security, and application insights. Since 2019 David has worked tirelessly to bring ChaosSearch’s revolutionary technology to engineering teams, garnering the company such accolades as the Data Breakthrough Award and Cybersecurity Excellence Award. A veteran of LogMeIn and OutSystems, David has spent 20 years creating revenue growth and developing teams for SaaS and PaaS solutions.