Microsoft just released an “air-gapped” LLM for spies: GCHQ doesn't sound wild about the tech...

"Every time someone breathlessly gets excited about how LLMs do reasoning, it turns out it was..."

Obligatory ludicrous stock image, credit: Sammy Williams, via Unsplash.

Microsoft has deployed an air-gapped Large Language Model (LLM) based on GPT-4 for the sole use of US intelligence agencies, its CTO for strategic missions and technology William Chappell said – apparently making good on a 2023 promise to launch an “initial preview of Azure OpenAI Service in Azure Government for government agencies” within Q1 of this year. 

Bloomberg, first to report on Microsoft’s achievement, described it gushingly as a “generative AI model entirely divorced from the internet” that the intelligence community can use to “safely harness the powerful technology to analyze top-secret information” and said that Redmond overhauled “an existing AI supercomputer in Iowa” for its infrastructure.

“There is a race to get generative AI onto intelligence data,” Bloomberg cited Sheetal Patel, assistant director of the CIA for the Transnational and Technology Mission Center, as saying at a recent conference (possibly speaking backwards and meaning to get intelligence data onto generative AI).

The first country to use generative AI for its intelligence would win that race, she said, per Bloomberg this week: “And I want it to be us.”

GCHQ's Chief Data Scientist sounds LLM-sceptical

Her counterparts across the Atlantic, the UK intelligence community's technology leaders, seem less enamoured of the technology.

Writing in late 2023, “Adam C”, Chief Data Scientist at UK signals intelligence and cybersecurity agency GCHQ, concluded cuttingly that “Current LLMs show promising potential as basic productivity assistants to improve efficiency of some repetitive intelligence tasks. 

“But the most promising use cases are still on the horizon, and future efforts should focus on developing models that understand the context of the information they are processing – rather than just predicting what the next word is likely to be,” he wrote in a jointly-authored Sep 2023 article. 

See also: “Barfing” code, RAG stacks and other AI lessons from LinkedIn

Speaking in April 2024 at the Alan Turing Institute, “Adam C” was even more downbeat, according to Economist Defence Editor Shashank Joshi.

Joshi quoted him on X as saying at Turing on April 26 that “every time someone breathlessly gets excited about how LLMs do reasoning… it turns out it was an example of memorization...I don't know of any counter examples of that. I do think there's something… structural about LLMs that makes this problematic… LLMs are a really sketchy technology for analysts who have a profound obligation to be right.”

“I'm not actually sure...this is something we're going to be talking about in 2-4 years from now,” but AI is still “an incredibly valuable back-end analytic capability,” he reportedly added. In his 2023 article he had noted that “whilst LLMs are designed to be able to ‘hold attention’ on a line of reasoning, to be useful in intelligence work they will need to be able to support complex reasoning that may be lateral and counterfactual.

See also: UK’s spy agencies grapple with IT modernisation – as GCHQ ramps up hacking capabilities

“It is unlikely that the state-of-the-art in LLMs can achieve this, as counterfactual reasoning is reliant on modelling relationships between entities in the real world. The development of hybrid architectures such as neurosymbolic networks, that combine the statistical inference power of neural networks with the logic and interpretability of symbol processing, appears to offer the most potential here.”

“We would encourage further research within the national security community on such promising techniques,” Adam C wrote at the time.

Among other undoubtedly useful things, LLMs are also exceptionally good at burning committed but undeployed customer spend in the cloud, as one hyperscaler executive recently noted candidly to The Stack. (Max out your token quota limit on Anthropic’s Claude 2.x on a hyperscaler and you could easily rack up a $46,000 bill in a single day, as cloud security specialist Sysdig recently noted in a compelling piece of research on an early-stage “LLM jacking” attempt whose footprints it identified.)

Microsoft’s “air-gapped” service went live on Thursday. It now faces testing and accreditation by the intelligence community, Bloomberg said. 

See also: “Scant evidence” – Google’s AI chemistry claims misleading