GPT-5’s modest gains suggest AI progress is slowing down
OpenAI’s latest large language model has achieved seemingly underwhelming improvements in performance, leading to questions about whether the AI industry can make significant advancements with its current designs
By Alex Wilkins
13 August 2025
GPT-5 is the latest version of OpenAI’s large language model
Cheng Xin/Getty Images
AI’s latest step forward isn’t so much a giant leap as a tentative shuffle. OpenAI has released its newest AI model, GPT-5, two years after rolling out GPT-4, whose success has driven ChatGPT towards world domination. But despite promises of a similar jump in capability, GPT-5 appears to show little improvement over other leading AI models, hinting that the industry may need a fresh approach to build more intelligent AI systems.
OpenAI hails GPT-5 as a “significant leap in intelligence” over the company’s previous models, claiming improvements in programming, mathematics, writing, health information and visual understanding. It also promises less frequent hallucinations, in which an AI presents false information as true. On an internal benchmark measuring “performance on complex, economically valuable knowledge work”, OpenAI says GPT-5 is “comparable to or better than experts in roughly half the cases… across tasks spanning over 40 occupations including law, logistics, sales, and engineering.”
However, GPT-5’s performance on public benchmarks isn’t dramatically better than that of leading models from other AI companies, such as Anthropic’s Claude or Google’s Gemini. It has improved on GPT-4, but on many benchmarks the gain is smaller than the leap from GPT-3 to GPT-4. Many ChatGPT customers have also been unimpressed, and examples of GPT-5 failing to answer seemingly simple queries have received widespread attention on social media.
“A lot of people hoped that there would be a breakthrough, and it’s not a breakthrough,” says Mirella Lapata at the University of Edinburgh, UK. “It’s an upgrade, and it feels kind of incremental.”
The most comprehensive measures of GPT-5’s performance come from OpenAI itself, since only it has full access to the model. Few details about the internal benchmark have been made public, says Anna Rogers at the IT University of Copenhagen in Denmark. “Hence, it is not something that can be seriously discussed as a scientific claim.”