Updated August 12 12:28 BST with details on capacity allocation
OpenAI has restored the option of “legacy models” after users reacted strongly to having GPT-5 imposed on them with tighter rate limits.
But CEO Sam Altman says the company will be forced to make “capacity tradeoffs” – suggesting on X that “rate limit increases” may be needed across models, amid a bumpy launch of its much-hyped new reasoning model.
Updated August 12: Altman said that OpenAI is "prioritising compute" in the following way in light of "the increased demand from GPT-5".
1. We will first make sure that current paying ChatGPT users get more total usage than they did before GPT-5.
2. We will then prioritize API demand up to the currently allocated capacity and commitments we've made to customers. (For a rough sense, we can support about an additional ~30% new API growth from where we are today with this capacity.)
3. We will then increase the quality of the free tier of ChatGPT.
4. We will then prioritize new API demand. We are ~doubling our compute fleet over the next 5 months (!) so this situation should get better.
In a series of posts, “Sama” said that “the percentage of users using reasoning models each day is significantly increasing”, so “rate limit increases are important” – a move that suggests sustained pressure on OpenAI’s infrastructure and comes amid mounting user frustration.
(AI use is demanding on infrastructure: ChatGPT crashed after its first launch in 2022, whilst in 2023 a then-lead engineer recalled discovering “that bottlenecks can arise from everywhere: memory bandwidth, network bandwidth between GPUs, between nodes, and other areas. Furthermore, the location of those bottlenecks will change dramatically based on the model size, architecture, and usage patterns... GPU RAM is actually one of our most valuable commodities. It's frequently the bottleneck, not necessarily compute.")
“Not everyone will like whatever tradeoffs we end up with, obviously, but at least we will explain how we are making decisions,” he added.
GPT-5’s launch on August 7 was met with much criticism, as some users lamented smaller performance upgrades than had been expected and said changes to its responses had made it seem more “clinical.”
See also: GPT-5 drops (some) API costs, goes head-to-head with Gemini on price
AI expert and OpenAI sceptic Gary Marcus said GPT-5 was a “major letdown” as it was “just not that different from anything that came before,” warning the launch may have damaged the company's brand.
However, Altman attempted to assuage concerns by explaining that an autoswitcher error had made GPT-5 appear “way dumber” at first, and saying UI changes would make it easier to “trigger thinking”.
On performance issues, he added that capacity had been strained as API traffic doubled over the first 24 hours of GPT-5's launch, and said the number of users using reasoning models, which require more compute, was "significantly increasing."
Altman added that ChatGPT Plus users would regain access to 4o after users expressed “very different opinions on the relative strength of GPT-4o vs GPT-5” – confessing to have underestimated user allegiance to the older model and adding that OpenAI would monitor usage data to decide how long to offer its earlier “legacy models” going forward.
The push for 4o’s revival also shone a light on a difficult fringe of AI users, though, with many calling for its reactivation after missing the personal ‘relationships’ they had developed with the model.
Altman said he felt “uneasy” about the dependence some have on ChatGPT for important life decisions and “subtle” cases where AI could reinforce delusion for users “in a mentally fragile state.”
He said OpenAI had been closely tracking the issue for a year but admitted the company, and society, still need to “figure out” how to make therapy and romantic-like AI relationships “a big net positive."