Sam Altman Says Oops, He Accidentally Made the New Version of ChatGPT Worse Than the Last One

Illustration by Tag Hartman-Simkins/Futurism. Source: Andrew Harnik/Getty Images

It’s been a little over three years since the launch of OpenAI’s ChatGPT, the first commercially available large language model (LLM) chatbot. And while AI models have certainly improved in performance since then, the lackluster showing of recent iterations has not helped the perception that LLMs are hitting a plateau.

For example, OpenAI CEO Sam Altman recently admitted that the company has “screwed up” the language capabilities of its latest chatbot iteration, GPT-5.2.

“I think we messed it up,” Altman said at a developer town hall on Monday. “We will make a future version of GPT 5.x that will hopefully be even better at writing than 4.5.”

Continuing, Altman said that the company chose to focus on ChatGPT’s technical capabilities, perhaps to the detriment of its human-language performance.

“We’ve decided, and I think for good reason, to put most of our effort into making 5.2 very good at intelligence, reasoning, coding, engineering, things like that,” Altman said. “And we have limited bandwidth here, and sometimes we focus on one thing and ignore another.”

The admission raises a high-level question: can frontier AI models continue to excel at tasks across the board, or will proficiency in one domain begin to come at the expense of broader skill sets?

As Search Engine Journal reports, the release of GPT-5.2 came with a heavy emphasis on technical tasks such as coding and formatting spreadsheets. Compared to previous iterations, there was little mention of writing or creative work, a turn that left many non-technical users feeling as if ChatGPT was hitting a wall.

Mehul Gupta, a data scientist and tech blogger, noted in a review of GPT-5.2 that there are plenty of signs the LLM is lagging, and some of them are not particularly subtle.

These include a “flattering tone,” poor translation quality, inconsistent behavior across tasks, and some major regressions in “instant mode,” a setting intended to provide quick answers to simple questions.

As Gupta writes, the model also struggles with real-world tasks. When it comes to evaluating messy human documents like contracts, mixed-format notes, or PDFs, GPT-5.2 “forgot earlier details, contradicted itself, misread cross-references, [and] confused explanations that didn’t exist.”

“The benchmarks are clean,” Gupta said. “Real docs aren’t. 5.2 is still struggling with the noise of reality.”

More on ChatGPT: Scientists horrified as ChatGPT deletes all their research
