Big Data, Graph, and the Cloud: Three Keys to Stopping Today’s Threats

An edited version of this blog was published as an article in Teiss on December 12, 2018.

Graph databases are having a bit of a moment in cybersecurity. With recent releases from industry juggernauts such as Microsoft and Google/VirusTotal, it seems “graph” could be poised to take center stage as a security industry buzzword and the next must-have cybersecurity technology. However, using a hot technology doesn’t deliver value by itself; it’s what you do with it that matters.

More than five years ago, CrowdStrike® Threat Graph™ became the industry’s first purpose-built graph database for cybersecurity, leveraging the power of the cloud to deliver on the promise of graph technology. Today, Threat Graph remains the largest and most sophisticated solution of its kind. This blog explores how graph technology is being applied to cybersecurity problems and how CrowdStrike has taken advantage of it to help our customers stop breaches.

What Is a Graph?

Graph technology represents a shift in how data is stored and retrieved from databases. A graph database captures individual records (or “nodes,” in graph terminology) that have freeform properties, and connects them via “edges” that describe the potentially complex relationships between them. Graph databases excel at executing queries that require understanding patterns and connections between different types of data.

As an example, imagine a restaurant recommendation engine built on a graph database. It might have nodes that describe you, your friends, your favorite restaurants and each of your hometowns. If you’re travelling to Texas and want to know your friends’ most highly recommended barbecue restaurant in Austin, a query via a properly constructed graph database uncovers the answer effortlessly. Graph database technology is at the core of Facebook’s Social Graph, Google’s Knowledge Graph, Twitter and many other “big data” platforms.
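The recommendation query above can be sketched in a few lines of Python. This is only a toy in-memory illustration, not how any production graph database works; the people, restaurants, and relationship names (FRIEND_OF, RECOMMENDS, LOCATED_IN, SERVES) are all made up for the example.

```python
from collections import defaultdict

# Each edge is a (source node, relationship, target node) triple.
# Every name here is hypothetical and exists only for illustration.
EDGES = [
    ("you", "FRIEND_OF", "alice"),
    ("you", "FRIEND_OF", "bob"),
    ("you", "FRIEND_OF", "carol"),
    ("alice", "RECOMMENDS", "smoke_shack"),
    ("bob", "RECOMMENDS", "smoke_shack"),
    ("bob", "RECOMMENDS", "taco_stop"),
    ("carol", "RECOMMENDS", "brisket_barn"),
    ("smoke_shack", "LOCATED_IN", "austin"),
    ("brisket_barn", "LOCATED_IN", "austin"),
    ("taco_stop", "LOCATED_IN", "austin"),
    ("smoke_shack", "SERVES", "barbecue"),
    ("brisket_barn", "SERVES", "barbecue"),
    ("taco_stop", "SERVES", "tacos"),
]


def build_index(edges):
    """Index edges by (source, relationship) so each hop is a single lookup."""
    index = defaultdict(set)
    for src, rel, dst in edges:
        index[(src, rel)].add(dst)
    return index


def neighbors(index, node, rel):
    """Return the nodes reachable from `node` over one `rel` edge."""
    return index.get((node, rel), set())


def top_recommendation(index, person, city, cuisine):
    """Walk person -> friends -> restaurants and count matching recommendations."""
    votes = defaultdict(int)
    for friend in neighbors(index, person, "FRIEND_OF"):
        for place in neighbors(index, friend, "RECOMMENDS"):
            in_city = city in neighbors(index, place, "LOCATED_IN")
            right_food = cuisine in neighbors(index, place, "SERVES")
            if in_city and right_food:
                votes[place] += 1
    if not votes:
        return None
    # Sorting first makes ties resolve alphabetically, so the result is stable.
    return max(sorted(votes), key=votes.get)


INDEX = build_index(EDGES)
print(top_recommendation(INDEX, "you", "austin", "barbecue"))  # smoke_shack
```

The answer falls out of following edges from one node to the next; nothing in the query has to scan every restaurant or every person.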

Graph is a natural technology for security. Attackers are adept at hiding their activity in the noise and using native tools that are difficult to separate from normal user activity. Stopping today’s threats requires continuous visibility into what is happening, and enough context to understand why. To use an example from the real world, if you witnessed a person stealing a purse from a passerby on the street, you might quickly call the police to report a crime in progress. However, if you also saw movie cameras and a crew capturing the scene, you’d likely conclude that you were watching the production of a summer blockbuster. Connecting the dots with context enables good decisions.

Today’s best techniques for detecting modern threats depend on collecting massive amounts of telemetry from endpoints, enriching it with context, and mining this data for signs of attack with a variety of analytic techniques. These analytics may come in the form of machine learning models, trained from massive historical data sets and relationships. Analytics may also reflect specific chains of behavior learned via real-world adversary encounters. Graph databases make it possible to apply many different types of analysis simultaneously, in real time, and at very large scale.

Graph databases also make human analysts much more efficient when performing security investigations and proactive threat hunting. Investigations typically start with an alert of some kind and a pile of questions: Who did this? What led up to it? What happened after? Have similar things been observed anywhere else? Finding answers to questions like these in a relational database requires sophisticated indexing and resource-intensive table searches. In a graph database, the answers to these and hundreds of other questions are built directly into the structure of the database itself. Graph databases help security analysts get instant answers to questions that might take hours or even days with solutions that rely on traditional relational database structures.
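To make the investigation questions concrete, here is a minimal sketch of how they map onto graph traversals. It is a toy model under stated assumptions, not a description of how Threat Graph is implemented: the hosts, process names, relationship names (SPAWNED, CONNECTED_TO, WROTE), and the documentation-range IP address are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical endpoint telemetry as (source, relationship, target) edges.
EDGES = [
    ("host1:outlook.exe", "SPAWNED", "host1:winword.exe"),
    ("host1:winword.exe", "SPAWNED", "host1:powershell.exe"),
    ("host1:powershell.exe", "SPAWNED", "host1:dropper.exe"),
    ("host1:powershell.exe", "CONNECTED_TO", "203.0.113.7"),
    ("host1:dropper.exe", "WROTE", "host1:C:/temp/payload.dll"),
    ("host2:explorer.exe", "SPAWNED", "host2:powershell.exe"),
    ("host2:powershell.exe", "CONNECTED_TO", "203.0.113.7"),
]

# Store every edge in both directions so we can walk forward or backward.
OUT = defaultdict(list)
IN = defaultdict(list)
for src, rel, dst in EDGES:
    OUT[src].append((rel, dst))
    IN[dst].append((rel, src))


def walk(start, adjacency, relationships=None):
    """Breadth-first walk from `start`; returns reached nodes in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for rel, nxt in adjacency.get(node, []):
            if relationships is not None and rel not in relationships:
                continue
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def also_contacted(address, exclude):
    """Other nodes that connected to `address` (seen anywhere else?)."""
    return sorted(
        src for rel, src in IN.get(address, [])
        if rel == "CONNECTED_TO" and src != exclude
    )


alert = "host1:powershell.exe"
led_up_to = walk(alert, IN, {"SPAWNED"})   # What led up to it?
happened_after = walk(alert, OUT)          # What happened after?
elsewhere = also_contacted("203.0.113.7", exclude=alert)

print(led_up_to)       # ['host1:winword.exe', 'host1:outlook.exe']
print(happened_after)  # ['host1:dropper.exe', '203.0.113.7', 'host1:C:/temp/payload.dll']
print(elsewhere)       # ['host2:powershell.exe']
```

Each question becomes a walk along existing edges from the alert node, rather than a join across several tables.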

Cloud Changes the Game

Of course, all of this is possible only if you have sufficient quantity and quality of data in your graph, along with the right tools to extract the insights you need from that massive pile of raw data points. Historically, this has been prohibitively expensive, requiring vast computing resources and a skilled staff, which made graph technology promising but impractical. Moving security to the cloud changes that dynamic. The CrowdStrike Threat Graph database has been at the core of the cloud-native Falcon platform since its inception, allowing enterprises to successfully stop breaches through the power of big data, analytics and the cloud.

Collecting the Right Data

CrowdStrike Threat Graph continuously collects more than 400 different types of endpoint behavior, spanning Windows, Linux, and macOS, from both user space and kernel space. Bringing more than a trillion events into the Threat Graph every week gives CrowdStrike solutions a broad and deep pool of diverse data in which to hunt for threats. This continuous, comprehensive data collection ensures our customers have the total visibility and context needed to identify sophisticated threats, while avoiding the critical flaw found in solutions that only record data if it appears related to an attack.

Delivering Insights

Threat Graph actively and automatically enriches and processes this data to reveal and block the most relevant threats in real time. Threat Graph applies multi-layered analysis, including artificial intelligence (AI) and behavioral techniques, and makes more than 3.5 million blocking decisions every second of every day.

In addition, the CrowdStrike OverWatch™ team of experts proactively hunts in this dataset, leveraging human experience to find the most sophisticated attacks. They can work faster because of the speed and efficiencies of the graph database, and they can work smarter because they are leveraging enriched global telemetry. The power of Threat Graph allowed CrowdStrike to stop more than 25,000 breaches last year through our proactive threat hunting efforts.

Providing Frictionless Access

Finally, Threat Graph provides analysts and integrators with real-time, forensic-level visibility into all endpoint activity, no matter how large the organization or how complex the query. This empowers incident responders and threat hunters to understand threats and take quick, decisive actions. Rich, open APIs ensure that organizations have a clear path to cyber maturity, by integrating Threat Graph data and workflows across the entire security operations center.

Graph technology has a clear role in your security arsenal. It processes mountains of data and quickly distills actionable insights. The CrowdStrike Threat Graph team is proud to continue leading the way.
