GPT-5.2 derives a new result in theoretical physics
What GPT-5.2 Actually Contributed
- Humans framed a specific scattering-amplitude problem, computed the low‑n base cases (which yielded very complicated expressions), and suspected a simpler closed form existed.
- GPT‑5.2 (in an internal “scaffolded” setup) spent ~12 hours simplifying those expressions, spotting a simple pattern, conjecturing a formula valid for all n, and producing a formal proof.
- Human physicists then checked the result and extended it into a full paper; GPT did not autonomously choose the problem or write the paper.
Novelty, Validity, and Literature Concerns
- Several commenters stress this is a preprint: results in theoretical physics are often later weakened, corrected, or quietly superseded.
- Some worry it may just repackage known structures (e.g. Parke–Taylor / MHV work) rather than produce something fundamentally new, though the authors explicitly cite that literature.
- This echoes earlier “AI solved Erdős problems” claims, where some “novel” solutions turned out to already be in the literature or to be minor variants of known results.
- One physicist reading the paper finds the key generalized formula almost obvious once the n ≤ 6 expressions are simplified, and suggests a computer algebra system (CAS) could plausibly have done the same.
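The CAS workflow that commenter has in mind — simplify the messy base cases, eyeball the pattern, conjecture a closed form, verify symbolically — can be sketched with SymPy. The telescoping sum below is a toy stand-in chosen for illustration, not the paper's actual amplitudes; the pipeline shape, not the expressions, is the point.

```python
import sympy as sp

x = sp.symbols('x', positive=True)

def base_case(m):
    # The "complicated" low-n expression: an unsimplified telescoping sum,
    # standing in for the messy scattering-amplitude base cases.
    return sum(1 / ((x + k) * (x + k + 1)) for k in range(m))

# Step 1: simplify each base case and inspect the results side by side.
simplified = {m: sp.simplify(base_case(m)) for m in range(1, 6)}

# Step 2: the simplified forms suggest a conjecture valid for all m.
def conjecture(m):
    return m / (x * (x + m))

# Step 3: verify the conjecture symbolically for every computed case.
ok = all(sp.simplify(base_case(m) - conjecture(m)) == 0 for m in range(1, 6))
```

In this toy, step 2 is the part a human (or model) does by pattern-spotting; a real CAS run on genuine amplitudes would stop at step 1 unless someone supplied the conjecture to check.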
Tool vs Collaborator: How to Attribute Credit
- Strong dispute over whether this is like “a calculator helped” or “a genuine co‑author.”
- Some argue GPT only refactored a pattern that humans then verified, so the headline overstates things.
- Others say an agent that autonomously runs for hours, reorganizes the calculation, conjectures, and proves something the humans had failed to find merits serious research credit, hence the institutional OpenAI author credit on the paper.
Capabilities, Limits, and “New Ideas”
- Many see this as exactly the sweet spot for LLMs: verifiable domains with test suites or formal checkers, where brute‑force structured exploration is valuable.
- Skeptics argue that so far LLMs mainly recombine existing ideas “in distribution” rather than producing paradigm‑shifting insights; defenders reply that most human advances are also recombinations.
- Discussion spills into whether anything humans do is more than refined brute‑force search, and whether current models yet show evidence of genuine out‑of‑distribution creativity.
Scaffolding, Long Runs, and Engineering Details
- Curiosity about how a 12‑hour run was orchestrated: likely multiple rounds of reasoning with context compaction (summarizing prior work into new prompts), possibly parallel branches and verification loops.
- Some users note current public “thinking” modes cut off around 30–60 minutes and require manual restarts; they want access to similar long‑horizon setups.
Perceived Significance for Physics
- Domain commenters describe the result as a nontrivial but quite specialized simplification/generalization within an already well‑developed amplitudes program, not a headline‑level revolution.
- Several emphasize that the hardest parts of physics are often choosing good questions, connecting theory to experiment, and spotting which abstruse results actually matter, all tasks where LLMs are still unproven.
Hype, Marketing, and Societal Reactions
- Many see the blog post as a carefully timed marketing piece (especially with an OpenAI employee on the author list), paralleling earlier overhyped AI “breakthroughs.”
- Others push back on the growing instinct to dismiss every AI-assisted result, noting that comparable human‑only achievements would be uncontroversially respected.
- There is extensive meta‑discussion about “moving the goalposts,” job anxiety, and the way AI success stories are being used in narratives about replacing knowledge workers.