Claude's Cycles [pdf]
Overview of the Result
- The paper describes how a reasoning-focused language model, guided by a human collaborator, explored many programmatic approaches and eventually discovered an algorithm that solved an open combinatorial problem for all odd cases.
- The human then proved correctness and wrote up the formal math; the even case remains unsolved.
Was This Genuine Novelty?
- Some commenters assert the model must have simply regurgitated part of its training set; others counter that:
  - The problem was presented as open in the literature.
  - The successful approach emerged only after ~30 failed explorations.
  - The model refined and reused earlier partial ideas, suggesting genuine search rather than memorization.
- Several note that if this were a known solution, it likely would have appeared immediately, not after a long iterative search.
What This Suggests About LLM Capabilities
- Many see this as strong evidence of nontrivial problem-solving: pattern search, hypothesis generation, code synthesis, and refinement under feedback.
- Others emphasize the human–model synergy: the person chose directions, restarted when outputs degraded, and translated the final algorithm into a proof.
- There is debate over whether this counts as “thinking” or simply “very powerful next-token prediction plus good tooling.”
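The propose–test–refine loop the commenters describe can be sketched abstractly. This is a hypothetical toy (a target-guessing task with stand-in `propose` and `evaluate` helpers), not the paper's actual algorithm; it only illustrates how feeding each failure back into the next proposal turns repeated guessing into directed search.

```python
import random

def propose(seed, bounds):
    """Stand-in for the model: propose a candidate within current bounds."""
    lo, hi = bounds
    random.seed(seed)
    return random.randint(lo, hi)

def evaluate(candidate, target):
    """Stand-in for running the candidate against test instances.
    Returns a verdict plus structured feedback on the failure."""
    if candidate == target:
        return "pass", None
    return "fail", (candidate < target)  # feedback: was the guess too low?

def search(target, max_rounds=200):
    """Iterative propose-test-refine loop, analogous to the
    human-steered exploration described above: each failed
    attempt narrows the space the next proposal draws from."""
    lo, hi = 0, 100
    for round_no in range(max_rounds):
        candidate = propose(round_no, (lo, hi))
        verdict, too_low = evaluate(candidate, target)
        if verdict == "pass":
            return candidate, round_no + 1
        # Refine: shrink the search interval using the feedback.
        if too_low:
            lo = candidate + 1
        else:
            hi = candidate - 1
    return None, max_rounds
```

Each failure removes at least one candidate from the interval, so the loop converges; the analogy is that the ~30 failed explorations were not wasted attempts but constraints that shaped later proposals.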
Intelligence, Memory, and Learning
- Long back-and-forth on whether models that can’t update their weights at inference time are truly “intelligent,” with analogies to human amnesia and external memory tools.
- Some argue that adding tool use, external memory, and agents on top of a base model can approximate long-term learning; others insist this remains fundamentally different from self-updating cognition.
Keeping Models Up to Date
- Concern about models as “time capsules” with fixed knowledge cutoffs.
- Discussion of:
  - Continual training vs. continual learning in-context.
  - Huge context windows, compaction, and the "dumb zone" when too much prior detail is lost.
  - Using user interactions and reasoning traces as future training data, with attendant privacy and consent worries.
Broader Implications and Skepticism
- Enthusiasts see this as an early sign that hard open problems (including in physics or pure math) might fall to similar approaches.
- Skeptics stress that current systems still make elementary errors, struggle with many novel problems, and rely heavily on human steering.
- Ethical concerns arise around surveillance, concentration of power, and the future role of human cognitive labor.