OpenAI researcher announced GPT-5 math breakthrough that never happened

What Actually Happened with the “GPT-5 Math Breakthrough”

  • GPT‑5 was used to query a community Erdős problem database; it surfaced existing published solutions to problems still marked “open” there.
  • The original researchers framed this as “superhuman literature search.”
  • A senior OpenAI exec then amplified it as “GPT‑5 just found solutions to 10 previously unsolved Erdős problems,” which many read as “novel solutions to unsolved problems.”
  • Mathematicians pointed out that the problems had been solved years earlier and that the site’s “open” status reflected the maintainer’s knowledge lag, not the problems’ actual status.
  • The OpenAI exec later retracted the claim, calling it a misunderstanding; some commenters see this as an honest mistake, others as part of a pattern of overclaiming.

Hype, Trust, and OpenAI Culture

  • Many argue this incident illustrates an institutional bias toward sensational claims (“science revolution,” “AGI achieved internally”), weak internal verification, and marketing-driven communication.
  • Others say the pile-on is disproportionate to the actual error and driven by generalized anti‑OpenAI sentiment.
  • Several note similar miscrediting episodes at other labs (e.g., AI “discovering” math or algorithms that already exist in the literature).

Hallucinations, Human Error, and Responsibility

  • The thread plays on the irony of “humans hallucinating about AI”: people at OpenAI believing their own hype and misreading ambiguous tweets.
  • Debate over whether this is best seen as hallucination, negligence, or lying; Hanlon’s razor is invoked, but corporate incentives are emphasized, via Upton Sinclair’s line that it is difficult to get a man to understand something when his salary depends on his not understanding it.
  • Many stress that extraordinary mathematical claims should face extraordinary internal scrutiny before going public.

What LLMs Are Actually Good At in Math & Research

  • Strong consensus that LLMs are currently poor at genuinely novel math or complex reasoning without heavy tool support.
  • Some describe GPT‑5‑style models as excellent semantic search / literature assistants (a minimal retrieval sketch follows this list):
    • Good at surfacing obscure or cross‑field papers and building reading lists.
    • Bad at reliably summarizing or evaluating the literature; hallucinated citations remain common (a simple existence check against a citation index, also sketched below, catches many of these).
  • Others say even as search helpers they’re “highly convincing counterfeits” and too error‑prone for serious work, especially with older or niche technical material.
  • Several suggest the real frontier value is better semantic search and citation graph tooling, not “AI solves open problems.”
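
To make the “semantic search assistant” idea concrete, here is a minimal sketch of embedding‑based retrieval over paper abstracts. It assumes the sentence-transformers package; the model name, toy corpus, and search() helper are illustrative stand‑ins, not anything described in the thread.

```python
# Minimal sketch of semantic literature search, assuming sentence-transformers.
# The model choice and corpus are placeholders for illustration only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder

# Hypothetical corpus of (title, abstract) pairs pulled from a paper index.
papers = [
    ("On a conjecture of Erdős", "We resolve a question of Erdős on additive bases ..."),
    ("Sum-free sets revisited", "We bound the maximum density of sum-free subsets ..."),
    ("Notes on graph colouring", "A survey of chromatic number results for sparse graphs ..."),
]
abstract_embeddings = model.encode(
    [abstract for _, abstract in papers], convert_to_tensor=True
)

def search(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Rank papers by cosine similarity between the query and each abstract."""
    query_embedding = model.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, abstract_embeddings, top_k=top_k)[0]
    return [(papers[hit["corpus_id"]][0], round(float(hit["score"]), 3)) for hit in hits]

# A query phrased nothing like any title can still land on the right paper,
# which is the appeal over plain keyword search.
print(search("Has anyone settled the Erdős question about sum-free sets?"))
```

The same pattern scales to millions of abstracts with an approximate‑nearest‑neighbor index; the ranking, not the generation, is doing the useful work.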
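
On the citation side, here is a hedged sketch of the kind of “does this reference actually exist?” check that would catch many hallucinated citations, querying Crossref’s public works endpoint. The similarity threshold and title‑matching heuristic are assumptions for illustration.

```python
# Sketch: flag possibly hallucinated citations by checking whether a title
# resolves against the public Crossref index. The threshold is an assumption.
import difflib

import requests

CROSSREF_WORKS = "https://api.crossref.org/works"

def citation_exists(title: str, threshold: float = 0.9) -> bool:
    """Return True if Crossref's best match for `title` is near-identical to it."""
    resp = requests.get(
        CROSSREF_WORKS,
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    best_title = " ".join(items[0].get("title", [""])).lower()
    return difflib.SequenceMatcher(None, title.lower(), best_title).ratio() >= threshold

# A fabricated reference typically returns no item, or a best match whose
# title similarity falls far below the threshold.
print(citation_exists("On the Evolution of Random Graphs"))  # a real Erdős–Rényi paper
```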

Broader Reflections: AGI, Bubble, and Pivot to Slop

  • Many see this as one more data point that we are far from AGI and that LLM “reasoning” progress has slowed; claims of near‑term super‑intelligence are seen as hype.
  • Some fear an AI investment bubble whose collapse could damage broader tech and even the economy; others think impact would be closer to a contained sector correction.
  • OpenAI’s recent pivots to ads, in‑chat commerce, and adult content are read by some as evidence of “enshittification” and of desperation to monetize, rather than of serious commitment to research.