The Power of Combining a Modular Security Data Lake with an XDR

Written by David Bunting | Aug 2, 2024

The 2024 Global Digital Trust Insights survey from PwC reports that 36% of businesses have experienced a data breach that cost more than $1 million to remediate. Cyber threats are clearly on the rise, and in today’s volatile threat environment it is a matter of when, not if, a cybersecurity incident will occur. Digital adversaries are becoming more sophisticated and are exploiting weak links in company applications and infrastructure.

At the same time that digital adversaries are ramping up their attacks, SecOps teams are adapting to a new and ever-changing cybersecurity reality. With the shift to cloud computing and remote work, SecOps teams face the challenge of monitoring a growing number of devices, infrastructure components, and applications outside the traditional network security perimeter. As a result, the sheer volume of data that security teams must capture, analyze, and retain for security monitoring, threat detection and hunting, and incident response can easily overwhelm legacy security analytics tools.

This blog post explores an emerging approach to security analytics for SecOps teams: combining an Extended Detection and Response (XDR) platform with a modular security data lake to create an XDR data lake. We’ll also compare XDR data lake solutions with traditional security analytics tools like Security Information and Event Management (SIEM) systems that companies still use for security monitoring and threat detection in modern cloud-native environments.

What is a security data lake?

As threats and attack vectors multiply and increase in complexity, it is essential to store data longer and to bring in more data sources. A security data lake can help teams sift through the noise, investigate, respond to, and mitigate real threats as they emerge, and examine the entire lifecycle of an incident.

On top of that, a flexible layer of automation can drive analysis of the many data sources in a security data lake, assess risk, and engage security teams when necessary to provide human review of conditions.

Some tools, like SIEMs, face challenges when it comes to scale, cost, and root cause analysis capabilities. We’ll cover those in more depth below. That’s why many security teams choose a data lake, which separates storage from compute, to reduce data storage costs without limiting retention. A modular security data lake can be built on top of low-cost cloud object storage like Amazon S3 or Google Cloud Storage. From there, you can index and search log data and other application telemetry data at a lower cost and at scale.

AWS customers can build a cloud-based security data lake on top of cost-effective Amazon S3 cloud object storage. (Source)
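
To make the idea of separating storage from compute more concrete, here is a minimal Python sketch of how log batches might be laid out in object storage so that a search only needs to touch the days it covers. The key layout and the names object_key and prefixes_for_range are illustrative assumptions, not an AWS or ChaosSearch convention, and the sketch only builds key strings rather than calling any cloud API.

```python
from datetime import date, datetime, timedelta, timezone


def object_key(source: str, ts: datetime, batch_id: str) -> str:
    """Build a date-partitioned object key for one batch of log records.

    The "logs/source=.../dt=.../" layout is a hypothetical convention
    chosen for this sketch. `ts` should be timezone-aware.
    """
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"logs/source={source}/dt={day}/{batch_id}.json.gz"


def prefixes_for_range(source: str, start: date, end: date) -> list[str]:
    """List the daily key prefixes that cover the inclusive range [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")
    prefixes = []
    day = start
    while day <= end:
        prefixes.append(f"logs/source={source}/dt={day.strftime('%Y-%m-%d')}/")
        day += timedelta(days=1)
    return prefixes
```

With a layout like this, a 90-day investigation simply lists 90 daily prefixes, while storage keeps growing independently of whatever compute is used to index or search it.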

Security data lakes help you centralize and store unlimited volumes of data so analysts don’t have to access logs across several different sources. This supports many security use cases, particularly threat hunting and detecting advanced persistent threats.

A security data lake can be deployed in tandem with an XDR platform (or a SIEM) to establish an XDR data lake solution that delivers on short-term analytics use cases (e.g. security monitoring, threat detection, incident response) while reducing the cost of long-term security data storage that supports use cases like security investigations, root cause analysis, and even regulatory compliance.

XDR vs. SIEM: What’s the difference?

Depending on the vendor, an XDR and a SIEM may have some overlapping capabilities. Let’s review the similarities and differences, including how an XDR reinforced by a security data lake could potentially replace a SIEM. In some cases where a SIEM is already deeply embedded, complementing a SIEM (like Splunk) with a security data lake can be more cost-effective and performant, especially for security investigation workloads that require a large volume of historical log data.

SIEM and XDR solutions both ingest and analyze telemetry data to detect security threats. But while SIEM tools rely on integrations with SOAR and other tools for incident response, XDR platforms have built-in capabilities to protect network assets by adjusting defenses to neutralize potential threats. (Source)

What is a SIEM?

Many organizations use a SIEM software solution for security analytics and threat hunting. A SIEM analyzes log data and telemetry, and provides real-time alerts about potential incidents. Key capabilities of a SIEM typically include:

Collecting and aggregating security log data from IT infrastructure
Storing security log data, usually for up to 30 days after collection
Event correlation and security analytics
Proactive threat detection
Security alert generation based on predefined security rules and policies
Incident response orchestration
Security incident forensic analysis and root cause investigations
Compliance management and reporting

An increasing number of SIEM products are incorporating AI and machine learning algorithms that can identify current or potential future cyber threats by ingesting and analyzing huge amounts of historical threat data.

Drawbacks of a SIEM

The sheer volume of cloud telemetry data can make SIEM systems impractical as the sole security analytics tool for many organizations. Some of the most common challenges include:

Cost and Maintenance: The upfront costs, along with the costs of setting up, fine-tuning, and maintaining a SIEM, can add up. This includes the cost of data retention and processing. Security analysts often need to manually tag and continually update these systems to keep signals accurate, which can take a lot of time for companies facing a security talent shortage.
Data Retention: Many SIEMs only retain data for 30 days due to cost issues, which limits the ability to investigate advanced persistent threats.
Alert Fatigue: SIEMs can generate many false positives, which makes it hard for security analysts to distinguish signal from noise in their real-time alerts.

What is an XDR?

Security attacks today are increasingly sophisticated and rarely exploit a single endpoint. An XDR can move beyond the limits of a SIEM by providing comprehensive monitoring of an organization’s entire attack surface, including endpoint devices and cloud infrastructure outside the enterprise network perimeter. Having this broader visibility means that an XDR can identify more patterns in your data to detect potential threats. The goal is to help security teams correlate seemingly disconnected events, to take immediate action and mitigate cybersecurity threats.

Leading XDR solutions available today include Cortex XDR by Palo Alto Networks, Singularity XDR by SentinelOne, and Intercept X Endpoint XDR by Sophos.

XDR vs. SIEM

Both an XDR and a SIEM are designed to collect and analyze security data within a central location. However, a SIEM and XDR are different in a few key ways:

Focus: SIEM solutions primarily offer centralized log management, alerting, and security analytics capabilities. XDR focuses on contextualizing the data it collects to enhance threat detection and response.
Complexity: As covered above, SIEM solutions often require ongoing management and fine-tuning. XDR solutions are designed to integrate with other components of an organization’s security architecture, such as a security data lake, to provide useful alerts.
Incident Response: While a SIEM can provide security operations center (SOC) analysts with the data and alerts required to identify potential threats, an XDR also includes the ability to support and coordinate response efforts.
Costs: An XDR data lake can cost significantly less when it comes to data ingestion and retention. This is particularly important for supporting long-term use cases for security log data, such as security incident root cause analysis and advanced persistent threat hunting.

Why you should build an XDR data lake

Between SIEM, SOAR (security orchestration, automation and response), XDR, and other security monitoring and analytics technologies, the tooling landscape can get confusing and costly for organizations, fast.

In many cloud-native environments, combining an XDR with a security data lake can enable longer data retention periods and long-term security analytics use cases at a lower cost than a SIEM (especially if a SIEM is not already in place).

Remember how SIEMs often cause alert fatigue, due to the overwhelming number of individual alerts triggered by the system? That volume can make it difficult to understand which threats need immediate attention. Layering an XDR with a security data lake can result in better coverage by providing multiple layers of security. An XDR can correlate and connect log data to add context to security events.
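
As a rough illustration of what correlating individual alerts into fewer, more contextual events can look like, the following Python sketch groups alerts that share an entity (such as a user or host) and occur close together in time. It is a simplified sketch under stated assumptions, not how any particular XDR product works: the Alert fields, the correlate function, and the 30-minute window are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    time: datetime
    entity: str   # hypothetical join key, e.g. a user or host name
    source: str   # e.g. "endpoint", "cloud", "identity"
    message: str


def correlate(alerts, window=timedelta(minutes=30)):
    """Group alerts into incidents.

    Two alerts land in the same incident when they share an entity and
    the later one arrives within `window` of the previous alert for
    that entity. Returns a list of incidents, each a list of Alerts.
    """
    by_entity = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.time):
        by_entity[alert.entity].append(alert)

    incidents = []
    for entity_alerts in by_entity.values():
        current = [entity_alerts[0]]
        for alert in entity_alerts[1:]:
            if alert.time - current[-1].time <= window:
                current.append(alert)
            else:
                incidents.append(current)
                current = [alert]
        incidents.append(current)
    return incidents
```

In this sketch, an endpoint alert and a cloud alert for the same user ten minutes apart would surface as one incident for an analyst to review rather than as two unrelated alerts.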

The deep activity data within an XDR can be fed into a security data lake to enable more extensive threat-hunting and investigation capabilities. This approach reduces the total cost of ownership (TCO) of a security analytics solution, making it more cost-efficient and performant to conduct deeper searches of log data that extend beyond 30 days.
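
One way to think about this split is as a routing decision: recent activity is searched in the XDR, and anything older than its retention window is searched in the data lake. The Python sketch below illustrates that idea only. The store names "xdr" and "data_lake", the route_query function, and the default 30-day window (taken from the retention figure cited in this post) are assumptions for illustration.

```python
from datetime import datetime, timedelta


def route_query(start, end, now, hot_retention=timedelta(days=30)):
    """Return the stores that must be searched to cover [start, end].

    Data newer than `now - hot_retention` is assumed to be available in
    the XDR; older data is assumed to live only in the data lake.
    """
    if start > end:
        raise ValueError("start must not be after end")
    cutoff = now - hot_retention
    stores = []
    if start < cutoff:
        stores.append("data_lake")
    if end >= cutoff:
        stores.append("xdr")
    return stores
```

Under these assumptions, a search over the last week touches only the XDR, while a 90-day root cause investigation spans both stores.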

Power Your XDR Data Lake with ChaosSearch

A security data lake powered by ChaosSearch can reduce the TCO of a SIEM tool like Splunk or provide an alternative to Datadog for security log analytics at scale. Sending critical security logs to ChaosSearch can reduce data storage and maintenance costs of a SIEM solution while overcoming limits on data ingestion or retention. As an added benefit, log data stored and indexed with ChaosSearch can be analyzed to support DevOps and ITOps use cases like cloud infrastructure observability and user behavior analysis.

For companies that don’t have a SIEM in place, or are frustrated by the limitations of a SIEM, it may be more economical and effective to build an XDR data lake that combines contextually aware alerting from an XDR solution with long-term data retention from a security data lake, enabling deeper threat hunting and security investigation capabilities.

An XDR data lake can support regulatory compliance requirements by enabling data retention beyond the 30-day window that’s standard for many SIEMs.

Ultimately, there’s no one tool to rule them all in a cloud-native world, but an XDR data lake consisting of a modern XDR platform backed by a security data lake like Chaos LakeDB can deliver on security analytics use cases at a dramatically reduced cost compared to a traditional SIEM.

Ready to learn more?

Watch our free on-demand webinar, "ChaosSearch + Datadog: Better Together," to see how our customers are reducing their Datadog Cloud SIEM costs by 30% and enabling long-term security analytics use cases by leveraging ChaosSearch for unlimited data retention.
