Cognitive Behaviors That Enable Self-Improving Reasoners
AI techniques vs. human learning
- One line of discussion asks whether advances in AI training have translated into better methods for training humans to think; several commenters say “not much yet.”
- Others flip the question: why don’t AI training methods draw more from established research on human learning and education? Some argue early AI tried this and it didn’t pan out.
Thinking aloud, rubber ducking, and LLM-inspired reasoning
- Multiple people report gains from mimicking “reasoning models” (e.g., verbose chain-of-thought) when studying or programming: speaking or writing out each step surfaces errors and extends working memory.
- Many point out this is essentially an old technique: think‑aloud protocols, writing, rubber duck debugging, tutorials, debating, and “talk through your solution” interviews.
- One critic argues this has nothing to do with AI; others reply that LLM outputs provide a large corpus of explicit, high-quality reasoning patterns humans can copy, which is new in scale if not in principle.
Memory, externalization, and cognitive offloading
- Lengthy historical and religious quotations about writing and tools "weakening memory" are contrasted with sayings that praise written records over unaided memory.
- Some argue it’s good to forget certain things and offload them to writing; others defend oral traditions and worry that externalization (now including AI) erodes important capacities.
- There's concern that heavy AI use encourages cognitive disengagement, that it increases some error rates in bureaucratic settings, and that studies already suggest degraded reasoning and decision-making when people over-rely on AI.
Internal monologue and diversity of thought
- A long subthread explores internal monologues vs. non-verbal thinking: some have constant inner speech; others report no accessible monologue and think in images, spatial “registers,” or abstract “raw thought.”
- Concrete math examples are used to probe how multi-step reasoning works without inner speech; answers emphasize direct manipulation of numeric concepts or visual representations.
- Several suggest inner speech is just a serial "interface layer," not the core of reasoning, raising questions about how token-based LLMs compare to human cognition.
Self-improving models and opaque internal languages
- One commenter worries that self-improving or multi-agent systems might converge on an internal, human-unreadable “babble” language while still solving tasks, making oversight difficult.
- Others counter that models are constrained by their training data, but acknowledge the idea of a dense internal “Neuralese” and a potential tradeoff between transparency and capability.
Skepticism about the paper and practicalities
- Some see the paper’s four “cognitive behaviors” (verification, backtracking, subgoals, backward chaining) as just facets of one generic problem-solving algorithm humans already practice.
- Others question the empirical basis for claims about expert human strategies and highlight the need to replicate striking results like models learning from incorrect-but-well-reasoned solutions.
- There is doubt that better prompting alone can reliably induce these behaviors; models often ignore such instructions or exceed context with verbose reasoning.
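To make the "one generic algorithm" claim concrete, here is a minimal toy sketch (not the paper's code; the operation set, function names, and problem are invented for illustration) showing how the four behaviors fit together in a single search: backward chaining works from the goal toward the start, backtracking abandons a branch when no inverse applies, subgoal decomposition separates planning from checking, and verification replays the plan forward.

```python
# Toy problem: find a sequence of operations ("*2", "+3") turning
# `start` into `target`. Everything here is illustrative, not the
# paper's method or code.

def backward_chain(target, start, depth=6):
    """Backward chaining: invert operations from the goal toward start."""
    if target == start:
        return []
    if depth == 0:                # depth limit forces termination
        return None
    if target % 2 == 0:           # try inverting "*2"
        tail = backward_chain(target // 2, start, depth - 1)
        if tail is not None:
            return tail + ["*2"]
    tail = backward_chain(target - 3, start, depth - 1)  # try inverting "+3"
    if tail is not None:
        return tail + ["+3"]
    return None                   # backtracking: no inverse worked here

def verify(start, ops, target):
    """Verification: replay the plan forward and check the result."""
    x = start
    for op in ops:
        x = x * 2 if op == "*2" else x + 3
    return x == target

def solve(start, target):
    """Subgoal decomposition: plan first, then verify as a separate step."""
    plan = backward_chain(target, start)
    if plan is not None and verify(start, plan, target):
        return plan
    return None
```

For example, `solve(2, 11)` finds `["*2", "*2", "+3"]` (2 → 4 → 8 → 11), while an unreachable target such as `solve(2, 6)` fails every inverse within the depth limit and returns `None`. The skeptics' point is that this loop (propose, check, undo, decompose) is ordinary problem solving, whatever labels the paper attaches to its parts.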