inference

| AI | May 06, 2026

The AI tool Google says can speed up LLM inference by 3x

DeepMind turned to a 2022 paper to speed up its open-source Gemma 4 models.

| AI | Apr 27, 2026

New Chinese open models challenge closed Western top tier

DeepSeek V4 and Qwen3.6-27B, both released last week, put pressure on Western models – especially on pricing.

| AI | Apr 20, 2026

Inference budgets are overrunning by "orders of magnitude" - what now?

Goldman Sachs: "Industry is countering by redistributing value [via] open source distilled models and the focus on roadmaps weighted to proprietary SLMs..."
