“Commodity or previous-gen hardware still has a long life in AI if the software layer is smart enough.”
Hardware and energy bottlenecks are slowing the pace of AI deployment. Rather than throwing cash at Nvidia and AMD for the latest semiconductors, enterprises are turning to software optimisations that let them run models and inference on lower-cost chips.
GPUs are scarce, expensive, power-hungry and often underutilised, yet enterprises still want to ship AI tools and features to stay competitive. The need for GPUs to run AI workloads is pushing enterprises toward alternative sourcing or inference-as-a-service, according to Nick Patience, AI lead at research firm The Futurum Group.
“The more persistent problem for most organisations is that companies commit too early and too much — signing large contracts but seeing very low internal adoption. That gap between contracted capacity and actual utilisation is where a lot of money is quietly being lost,” he said.