The agentic AI hype will be short-lived and expensive for thousands of companies, according to new research from Gartner, which warns just a small fraction of agent projects are “real” applications of the technology.

In its “Avoid Agentic AI Failure” report, Gartner said it would label just 130 agentic AI vendors as real, and predicted that 40% of agentic AI projects will fail by the end of 2027 as reality about the technology's limitations dawns.

Senior Director Analyst Anushree Verma said: “Most agentic AI propositions lack significant value or ROI, as current models don’t have the maturity and agency to autonomously achieve complex business goals or follow nuanced instructions over time."

Despite major investments in the technology (Bank of America predicted this month that spending on agentic AI could reach $155 billion by 2030), Verma said many of the use cases outlined by vendors "don't require agentic implementations."

The majority of current projects are simply "early stage experiments or proof of concepts", said Gartner's analysts, echoing claims made about some start-ups, such as the now-defunct Builder.ai, an AI-powered, low-code app builder accused of using human engineers to do the majority of its work while obfuscating its process behind claims of a "human-assisted, AI powered assembly line."

What do researchers say?

Gartner’s claims will fuel existing agent scepticism and come as researchers express similar concerns.

One example: a new Carnegie Mellon University (CMU) study, whose team deployed a company staffed entirely by AI agents (TheAgentCompany) to explore the agents' capabilities in a real working environment.

After ‘hiring’ agents from Anthropic, Google, Meta and others, CMU's initial findings reported that the best-performing agents could complete only around 30% of the tasks given to them. They were often held up by their own attempts to skirt difficult tasks, or by struggles to navigate social and web-based interactions, all while running up a large bill in the process.

See also: AI Agents: Rock, Paper, Scissors, Hype?

Researchers said that for some tasks, "when the agent is not clear what the next steps should be, it sometimes tries to be clever and create fake 'shortcuts' that omit the hard part of the task", while on other tasks agents struggled to get past popup windows when browsing the web, or to follow up on advice to speak with a colleague.

Similarly, the Bank for International Settlements’ “Putting AI agents through their paces” report said agents were “impressive at narrowly defined tasks” but lacked the “self-awareness” to change course when things went wrong while playing online games.

It argued that the "primary limitation" of models such as Anthropic's Claude, which performed best in the CMU study, was their "inability to self-criticise and self-correct", and said it will be crucial for models to overcome this hurdle if they are to "become self-improving" and take on complex human-like tasks.

A positive way forward?

It’s not all bad news for agents: Gartner also said it expected 15% of day-to-day work decisions to be made by agents, a small number but still up from 0% in 2024, and 33% of enterprise software applications to include agentic AI by 2028, though the gap between the share of applications and the share of decisions made does raise further questions about productivity.

It has also previously given much more positive predictions, including a claim in March that agents would resolve 80% of common customer service issues by 2029 and reduce operational costs by 30%.

See also: Goldman Sachs CIO: The next challenge for AI agents? Getting them to fit in with OUR culture

An MIT and Harvard Business School study also recently found that AI agents can be trained to overcome their impractical adherence to policies and handle tasks involving necessary exceptions, citing its work explaining to AI agents why a $10.01 bag of flour was still worth purchasing to bake a birthday cake with a $10 flour budget.

While researchers said the study highlighted a “critical misalignment” between LLM and human decision-making processes, they also advocated enhancing off-the-shelf LLMs before deploying them, and said that after “supervised fine tuning”, AI models could “generalise human-like decision-making to novel scenarios.”

For now, though, Verma said companies must “focus on enterprise productivity, rather than just individual task augmentation” to get the most out of their agents.
