AI predictions for 2024: From SLMs to new architectures

You may have missed it, but AI was rather a talking point in 2023. CEOs urgently demanded AI strategies, everyone became an expert in prompt engineering and it was unexpectedly sexy, suddenly, to turn up in RAGs. (Retrieval Augmented Generation: An architectural approach that pulls your data as context for LLMs to improve relevancy, but you knew that!)

Looking at AI predictions for 2024, some key themes emerge. Veeam’s CTO Danny Allan was done with the noise – saying in an emailed comment to The Stack that “we’ve seen the hype around blockchain and Web3 in years past, and generative AI is no different… we can’t expect any groundbreaking use cases and broader adoption for quite some time.”

As views go, this was refreshingly contrary, but definitely an outlier.

Others were more upbeat.

Here are some key AI predictions for 2024.

AI predictions for 2024: Enter “a standard open LLM”

Dr Andriy Burkov, author of two books on AI and with a PhD on complex multi-agent decision problems, predicted: “By the end of 2024 we will have a standard open LLM, an equivalent of BERT or RoBERTa in the world of LLMs. This standard model will be trained on a dataset free of legal issues (e.g. open domain books, Wikipedia, and other documents released under the Creative Commons license) and everyone will use it for fine tuning in the business context. The size of this standard model will be below 7B parameters and its performance will be equivalent to GPT-3.5.”

He added on LinkedIn: “No serious user-facing product will display GPT-4-generated output given its legal issues that will continue and become even more serious throughout 2024 [and] new architectures competing with Transformer, such as Mamba, will appear. They will be more cost-efficient. GPU price will also drop. As a consequence, the cost of LLM inference will drop by a factor of 5 to 10 by the end of 2024.”

Denis V. Batalov, Worldwide Technical Leader, AI, AWS, said his AI predictions for 2024 included more domain-specific models: “Imagine models that are designed for accounting, economics, jurisprudence, manufacturing, construction, fisheries... Different domains are aplenty and the only question is whether there is enough training data from that specific domain to make it feasible… I would not be surprised if [an] entire industry will emerge around offering curated and current training data.”

He added: “What if the transformer-like architectures were applied to much more different types of data and use cases. What if we train a model on software logs? Wouldn't this be helpful for detecting anomalies? How about models that are trained on various quantized metrics? It would not be hard to get exabytes of such data collected and patterns extracted from such data would be highly valuable. Predictive maintenance, weather forecasts, sales anomalies... Sky is the limit.”

Batalov acknowledged that hallucinations (“there practically by design - the models are probabilistic in nature”) will continue to be an issue: “More explicit fact based logical reasoning capabilities are required. Turns out this is really hard! While I expect progress here in 2024, I am actually skeptical that this "problem" can be outright solved in the next year.”

Martin Signoux, Public Policy Manager at Meta, played up the potential for AI-augmented smart glasses (Meta has skin in that game) whilst also saying “LLMs will remain intrinsically limited and prone to hallucinations.”

(Educator and no-code operator Vensy Krishna agreed: "1. AI smart glasses will be mainstream. 2. More open-source AI models will be released. 3. A lot more applications in healthcare (affordability, quality). 4. The AI video space will boom, and we'll see full-fledged movies and short films. 5. AI voice-first apps with more natural-interacting voices will be at the forefront. 6. More AI dependency in day-to-day usage like fashion, e-commerce, and 7. Humanoids like Tesla's Optimus will be seen in real-world applications...")

One of Signoux's key AI predictions for 2024 is that “cost-efficiency and sustainability considerations” will accelerate a trend towards “Small Language Models” (SLMs rather than LLMs), adding that “quantization will also greatly improve, driving a major wave of on-device integration for consumer services.” (Quantisation refers to a set of techniques that map input values from a large set to output values in a smaller set; for language models, that typically means storing weights at lower numerical precision, such as 8-bit integers rather than 32-bit floats, so the model needs less memory to run…)
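
For the curious, the idea behind quantisation can be sketched in a few lines of Python. This is a minimal illustration only, assuming the simplest symmetric 8-bit scheme with a single shared scale; the function names and sample weights are invented for the example, and production toolchains use considerably more sophisticated variants.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Hypothetical helper for illustration; not taken from any library.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Made-up "weights" standing in for a slice of a model layer.
weights = [0.82, -1.27, 0.003, 0.41, -0.66]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantisation step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("integer codes:", q)
print("scale:", scale)
print("largest rounding error:", max_error)
```

The trade-off is visible even at this scale: each value now fits in one byte instead of four, but very small weights (here 0.003) collapse to zero, which is one reason quantised models can lose some accuracy.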

He also predicted that an “open model beats GPT-4… We’re ending 2023 with only 13% left between Mixtral and GPT-4 on MMLU. Open models are here to stay and drive progress, everybody realised that. They will coexist with proprietary ones, no matter what OS detractors do.”

But benchmarking “remains a conundrum. No set of benchmarks, leaderboard or evaluation tools emerge as THE one-stop-shop for model evaluation. Instead, we’ll see a flurry of improvements (like HELM recently) and new initiatives (like GAIA), especially on multimodality.”