Claude's Cycles [pdf]

Overview of the Result

  • The paper describes how a reasoning-focused language model, guided by a human collaborator, explored many programmatic approaches and eventually discovered an algorithm that solved an open combinatorial problem for all odd cases.
  • The human then proved correctness and wrote up the formal math; the even case remains unsolved.

Was This Genuine Novelty?

  • Some commenters assert the model must have simply regurgitated part of its training set; others counter that:
    • The problem was presented as open in the literature.
    • The successful approach emerged only after ~30 failed explorations.
    • The model refined and reused earlier partial ideas, suggesting genuine search rather than memorization.
  • Several note that if this were a known solution, it likely would have appeared immediately, not after a long iterative search.

What This Suggests About LLM Capabilities

  • Many see this as strong evidence of nontrivial problem-solving: pattern search, hypothesis generation, code synthesis, and refinement under feedback.
  • Others emphasize the human–model synergy: the person chose directions, restarted when outputs degraded, and translated the final algorithm into a proof.
  • There is debate over whether this counts as “thinking” or simply “very powerful next-token prediction plus good tooling.”
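The workflow commenters describe (propose a candidate, verify it, keep partial progress, refine) can be sketched as a generic search loop. Everything below is a toy stand-in: the `propose`, `verify`, and `TARGET` names are invented for illustration, and a real run would have the model emit code while a checker tests it against the open problem.

```python
import random

TARGET = 742  # hidden value the toy "verifier" checks against

def propose(best, rng):
    """Stand-in for the model's hypothesis step.

    In the workflow described above the model would emit candidate
    code; here a candidate is just an integer.  When a promising
    earlier attempt exists, refine it locally instead of starting over.
    """
    if best is None:
        return rng.randrange(1000)        # cold start: explore broadly
    return best + rng.choice([-1, 1])     # warm start: refine best-so-far

def verify(candidate):
    """Stand-in for running the candidate against the problem.
    Returns a score; 0 means the verifier accepts."""
    return -abs(candidate - TARGET)

def search(max_iters=5000, seed=0):
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(max_iters):
        cand = propose(best, rng)
        s = verify(cand)
        if s > best_score:                # keep partial progress
            best, best_score = cand, s
        if best_score == 0:               # verifier accepts: done
            return best
    return best
```

The `keep partial progress` step is what distinguishes the iterative refinement commenters point to from a sequence of independent retries: each attempt builds on the most promising earlier one.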

Intelligence, Memory, and Learning

  • Long back-and-forth on whether models that can’t update their weights at inference time are truly “intelligent,” with analogies to human amnesia and external memory tools.
  • Some argue that adding tool use, external memory, and agents on top of a base model can approximate long-term learning; others insist this remains fundamentally different from self-updating cognition.
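The "external memory on a frozen base" position can be pictured as a thin wrapper that stores facts outside the model and retrieves them at prompt time. This is a hypothetical sketch, not any vendor's API; `FrozenModel` and `MemoryAugmented` are invented names.

```python
class FrozenModel:
    """Stand-in for a base model whose weights never change."""
    def answer(self, prompt):
        # A real model would generate text; this toy just echoes the prompt.
        return f"answer({prompt})"

class MemoryAugmented:
    """Hypothetical wrapper: a frozen model plus an external note store.

    "Learning" happens only in the store, never in the model itself,
    which is exactly the distinction the debate above turns on.
    """
    def __init__(self, model):
        self.model = model
        self.notes = {}                   # external memory, persists across sessions

    def remember(self, key, fact):
        self.notes[key] = fact

    def ask(self, prompt):
        # Retrieve notes whose key appears in the prompt and prepend them.
        relevant = [v for k, v in self.notes.items() if k in prompt]
        context = "; ".join(relevant)
        return self.model.answer(f"[{context}] {prompt}" if context else prompt)
```

Whether updating `self.notes` counts as the system "learning" is precisely what the two camps disagree about.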

Keeping Models Up to Date

  • Concern about models as “time capsules” with fixed knowledge cutoffs.
  • Discussion of:
    • Continual retraining of model weights vs. continual learning purely in-context.
    • Huge context windows, compaction, and the “dumb zone” when too much prior detail is lost.
    • Using user interactions and reasoning traces as future training data, with attendant privacy and consent worries.
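The compaction idea above can be illustrated with a toy budget policy: keep recent turns verbatim, digest older turns, and drop the oldest digests when still over budget. This is a hypothetical sketch (a real system would have the model write the summaries rather than truncate); the detail discarded here is exactly where the "dumb zone" risk lives.

```python
def words(turns):
    """Total whitespace-delimited tokens: a crude stand-in for a token count."""
    return sum(len(t.split()) for t in turns)

def compact(history, budget, keep_recent=2):
    """Toy context-compaction policy (hypothetical, not any vendor's API).

    Keep the newest `keep_recent` turns verbatim; collapse each older
    turn to a short digest line; if the result still exceeds the word
    `budget`, drop the oldest digest lines first.
    """
    recent = list(history[-keep_recent:])
    older = history[:-keep_recent]
    # Crude "summary": first three words of each old turn.
    digest = ["[summary] " + " ".join(t.split()[:3]) for t in older]
    context = digest + recent
    while digest and words(context) > budget:
        digest.pop(0)                     # oldest detail is lost first
        context = digest + recent
    return context
```

Running it on a four-turn history with a 12-word budget keeps the two newest turns intact and retains only the newer of the two digests, which shows both the benefit (it fits) and the hazard (early specifics are gone).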

Broader Implications and Skepticism

  • Enthusiasts see this as an early sign that hard open problems (including in physics or pure math) might fall to similar approaches.
  • Skeptics stress current systems still make silly errors, struggle with many novel problems, and rely heavily on human steering.
  • Ethical concerns arise around surveillance, concentration of power, and the future role of human cognitive labor.