Darktrace has built a security platform capable of mapping threats across sprawling cloud environments in real time. But its engineers faced a fundamental architectural question: What kind of database could handle the relational complexity of cloud-based enterprise infrastructure?
The Cambridge-based cybersecurity company, which was acquired by private equity firm Thoma Bravo for $5.3 billion in 2024, chose Amazon Neptune, a managed graph database service. Darktrace’s Stephen Pickman, SVP of Engineering, and Adam Stevens, Senior Director of Product for Cloud, sat down with The Stack to explain why – and the trade-offs involved.
When relational databases hit their limits
Graph databases address a class of problem that relational systems handle poorly: queries that traverse multiple degrees of relationship. In security contexts, as well as fraud detection, supply chain mapping, or identity management, these multi-hop queries are often the entire point.
Darktrace uses a graph database to understand the relationships between the resources in its clients' cloud environments, and to map attack paths. Among other benefits, said Stevens, this means the company’s cloud visibility data and posture management findings can sit together.
"You could do all of this in a relational database like Postgres, for instance.
"The problem then becomes one of scale," Pickman explained.
"The questions you want to ask of this data are ‘show me the nearest neighbour, show me the twice-removed nearest neighbours.’ Trying to do that in a classic relational database is a very difficult problem to solve, both from an efficiency point of view, but also just the kind of storage."
"Security teams need to have context and you can infer lots of different pieces of context and relationships by looking at the nodes [Nodes are vertices that store the data objects. Each node can have an unlimited number and types of relationships] in the graph database," Pickman added.
Buy-versus-build: graph database edition
Darktrace's decision to use a managed graph database rather than self-hosting was straightforward: although open source alternatives are available, the operational burden of maintaining a new database would divert engineers from product development. The company would rather focus scarce engineering resources on adding value for its customers and securing them.
Stevens said, "[Using Amazon Neptune] allows us to quickly build and progress that product without having the sort of overhead that you might otherwise have if you were running your own graph database."
Pickman was blunter about the alternative. "Having to manage your own graph database [is] costly; quite resource-intensive."
By contrast, letting AWS handle graph database management means Darktrace engineers don’t have to worry about constantly optimizing for latency or scale. And the relationship extends beyond infrastructure…
"[We have a] very positive relationship with the Neptune product team," Pickman added – a view common to Neptune’s customers. "They've been fantastic at helping us onboard, answering our questions."
Graph database dos and don'ts...
Darktrace's experience also offers a cautionary note.
Graph databases excel at mapping and tracking complex evolving relationships and behaviours, but they are not universal data stores.
Treating them as such can undermine performance, a lesson that new users sometimes have to learn on the fly. "One of the issues we have hit with Neptune is trying to put too much data onto the individual nodes and edges themselves," Pickman acknowledged.
His recommendation: use graph databases for what they do uniquely well, and offload everything else. "My main takeaway has been Neptune is a graph database, and it's fantastic at being a graph database.
"If there's significant data that you want to store, either on those nodes as labels or on that edge data, you might want to consider pulling that out into, for example, an RDS Postgres solution or maybe S3."
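One way to picture Pickman's advice is a node that carries only a few small properties plus a pointer to the bulky data held elsewhere. The sketch below is an assumption-laden illustration in plain Python: two dictionaries stand in for the graph store and for an external store such as S3 or Postgres, and the function names and key layout are hypothetical, not anything Darktrace or AWS has described.

```python
import json

# Stand-ins for real services (assumptions for illustration): one dict plays
# the role of the graph store, the other plays the role of an object store.
graph_nodes = {}
object_store = {}

def add_resource(node_id, kind, bulky_config):
    """Store the bulky payload outside the graph; keep only a reference on the node."""
    payload_key = f"configs/{node_id}.json"
    object_store[payload_key] = json.dumps(bulky_config)
    graph_nodes[node_id] = {"kind": kind, "payload_key": payload_key}

def load_config(node_id):
    """Follow the node's reference to fetch the full payload on demand."""
    payload_key = graph_nodes[node_id]["payload_key"]
    return json.loads(object_store[payload_key])

add_resource("web-server", "vm", {"packages": ["nginx", "openssl"], "ports": [80, 443]})
```

Traversals then touch only the slim node records, and the full configuration is fetched with load_config only when someone actually needs it.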
Infrastructure that pre-empts agentic workloads
Perhaps the most forward-looking element of Darktrace's strategy involves preparing for a security landscape where autonomous AI agents operate across enterprise systems. As security researchers have noted, agents need to be able to interact with multiple systems, creating new authorization challenges that traditional identity controls were never designed to address.
Pickman sees graph infrastructure as foundational to this emerging problem. "It's about bringing this context together in a queryable fashion. Not only does that benefit the security context, but also [helps] AI agents understand how various pieces of knowledge are interrelated."
"Neptune is going to be likely a key part of that, as we look to understand what these agents have access to," Pickman said.
Stevens framed the challenge in familiar security terms. "It's still about knowing where those agents are, what they've got access to, what sort of permissions you've given it, what data they're accessing."
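Stevens's questions map naturally onto a traversal over labelled edges. As a rough sketch only, the Python below models two hypothetical agents, the roles they assume, and the resources those roles can touch, then walks the graph to list everything one agent can reach and by what route. The agent names, relationship labels and schema are all invented for illustration and do not describe Darktrace's data model.

```python
# Hypothetical labelled edges: (source, relationship, target).
ACCESS_EDGES = [
    ("support-agent", "ASSUMES", "support-role"),
    ("support-role", "CAN_READ", "ticket-db"),
    ("support-role", "CAN_INVOKE", "refund-function"),
    ("refund-function", "CAN_WRITE", "payments-db"),
    ("billing-agent", "ASSUMES", "billing-role"),
    ("billing-role", "CAN_READ", "payments-db"),
]

def reachable_nodes(edges, agent):
    """Return {node: path} for every node reachable from `agent` along directed edges."""
    outgoing = {}
    for source, relationship, target in edges:
        outgoing.setdefault(source, []).append((relationship, target))
    paths = {}
    stack = [(agent, [agent])]
    while stack:
        node, path = stack.pop()
        for relationship, target in outgoing.get(node, []):
            if target in paths or target == agent:
                continue  # already recorded a route to this node
            paths[target] = path + [relationship, target]
            stack.append((target, paths[target]))
    return paths

SUPPORT_ACCESS = reachable_nodes(ACCESS_EDGES, "support-agent")
```

Here the support agent has no direct edge to the payments database, yet the traversal surfaces an indirect route through the refund function, which is the kind of transitive access that is hard to see when permissions are reviewed one at a time.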
For enterprise IT leaders, the lesson may be that graph database investments made today for one established use case, like security posture management, fraud detection or identity mapping, could prove foundational for AI workloads that are emerging fast.
Delivered in partnership with Amazon Neptune