AI Search: The Bitter-Er Lesson
What “search” means here
- Most commenters read “search” as classic AI tree search (minimax, MCTS, breadth/depth-first), not web search or RAG.
- For LLMs, this would mean branching over candidate thoughts or solutions, evaluating them, pruning weak branches, and only then answering, akin to “pondering”.
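The branch, evaluate, prune loop described above can be sketched as a small beam search. This is a toy illustration, not anything proposed in the thread: the names `beam_search`, `expand`, `score`, and `is_goal` are hypothetical, and the "thoughts" here are just chains of integers built with the moves +3 and ×2. In an LLM setting, `expand` would stand in for sampling candidate continuations and `score` for some learned or automated evaluator.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # a partial "chain of thought": the numbers visited so far


def beam_search(
    start: State,
    expand: Callable[[State], List[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 3,
    max_depth: int = 10,
) -> Optional[State]:
    """Keep only the `beam_width` best partial chains at each depth."""
    frontier = [start]
    for _ in range(max_depth):
        # Answer as soon as any surviving chain reaches the goal.
        for state in frontier:
            if is_goal(state):
                return state
        # Branch: every surviving chain proposes its successors.
        candidates = [child for state in frontier for child in expand(state)]
        if not candidates:
            return None
        # Evaluate and prune: sort by score, keep the top few.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    for state in frontier:
        if is_goal(state):
            return state
    return None


# Toy problem (an assumption for illustration): reach TARGET from 1.
TARGET = 22


def expand(state: State) -> List[State]:
    last = state[-1]
    return [state + (last + 3,), state + (last * 2,)]


def score(state: State) -> float:
    # Higher is better: negative distance to the target.
    return -abs(TARGET - state[-1])


def is_goal(state: State) -> bool:
    return state[-1] == TARGET


result = beam_search((1,), expand, score, is_goal)
print(result)
```

The heuristic `score` is what makes pruning possible at all. With a misleading score the beam can discard the only path to the goal, which is the value-function concern raised later in the thread.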
Perceived promise of adding search to LLMs
- Could let models spend more compute on hard problems and less on easy ones.
- Might turn today’s “intuitive oracle” LLMs into explicit problem solvers that can revise and refine plans before replying.
- Some see this as a plausible path to much stronger systems or even “AI foom,” especially in domains with cheap, automated evaluation (games, theorem proving, fuzzing, some science tasks).
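A minimal sketch of the "more compute on hard problems, less on easy ones" idea, assuming a cheap automated verifier is available. Everything here is hypothetical: `solve_with_budget` and `find_divisor` are illustrative names, and the naive candidate stream `range(2, n)` stands in for whatever a model would propose. The point is only that the number of attempts adapts to the problem, up to a fixed budget.

```python
from typing import Callable, Iterable, Optional, Tuple


def solve_with_budget(
    candidates: Iterable[int],
    verify: Callable[[int], bool],
    budget: int,
) -> Tuple[Optional[int], int]:
    """Try candidates until one verifies or the budget runs out.

    Returns (answer or None, number of candidates actually checked).
    """
    attempts = 0
    for candidate in candidates:
        if attempts >= budget:
            break
        attempts += 1
        if verify(candidate):
            return candidate, attempts  # easy problems exit early
    return None, attempts


def find_divisor(n: int, budget: int = 50) -> Tuple[Optional[int], int]:
    # Hypothetical proposer: a plain enumeration in place of a model.
    proposals = range(2, n)
    return solve_with_budget(proposals, lambda c: n % c == 0, budget)


print(find_divisor(10))            # an easy instance: verified on the first try
print(find_divisor(91))            # a harder instance: more attempts before success
print(find_divisor(91, budget=3))  # the budget caps how much compute is spent
```

This only works because verification is cheap and exact here, which is the same condition the thread attaches to games, theorem proving, and fuzzing.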
Limits: value functions and search spaces
- Strong objection: search works in chess because chess has a well-defined state space and a fast, accurate value function; real-world tasks and “AI research” have neither.
- Value functions today are highly domain-specific; general ones are lacking and their feasibility is unclear.
- For broad domains (AI research, curing Alzheimer’s, curing cancer), even the state space and the transitions between states are unclear.
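To make the objection concrete, here is a toy sketch of what game search relies on: an explicit state, explicit transitions, and a value function to call when the search is cut off. The game is a simple Nim variant (take 1 to 3 stones, whoever takes the last stone wins), chosen as an assumption for illustration. The names `value_estimate`, `negamax`, and `best_take` are hypothetical.

```python
def value_estimate(stones: int) -> float:
    """Fast value function, from the viewpoint of the player to move.

    For this Nim variant, positions with a multiple of 4 stones are lost
    for the player to move, so the estimate happens to be exact. Real
    domains rarely offer anything this cheap and this accurate.
    """
    return -1.0 if stones % 4 == 0 else 1.0


def negamax(stones: int, depth: int) -> float:
    """Depth-limited search; falls back to the value function at the cutoff."""
    if stones == 0:
        return -1.0  # the previous player took the last stone and won
    if depth == 0:
        return value_estimate(stones)
    # Transitions are fully known: remove 1, 2, or 3 stones.
    return max(-negamax(stones - k, depth - 1) for k in (1, 2, 3) if k <= stones)


def best_take(stones: int, depth: int = 2) -> int:
    """Number of stones to take, according to the depth-limited search."""
    legal = [k for k in (1, 2, 3) if k <= stones]
    return max(legal, key=lambda k: -negamax(stones - k, depth - 1))


print(best_take(10))
```

Each of the three ingredients (`stones` as the state, the take-1-to-3 rule as the transition, `value_estimate` as the evaluator) is trivial to write down here. The commenters' point is that none of them has an obvious counterpart for open-ended tasks.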
Compute and practicality
- Tree search over token sequences is extremely expensive computationally: at the token level, the branching factor is in the tens of thousands.
- Even coarse-grained, idea-level branching could be very expensive; recent papers that use search run drastically fewer rollouts than game AIs do, which suggests cost pressure.
- Debate over train-time vs inference-time cost tradeoffs; 100–1000× inference cost may be unacceptable for many applications.
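A rough back-of-the-envelope sketch of why the branching factor dominates the cost. A full tree with branching factor b and depth d has 1 + b + b^2 + ... + b^d nodes. The specific numbers below (a 50,000-token vocabulary, 10 tokens of lookahead, 5 ideas per step, 4 steps) are illustrative assumptions, not figures from the thread, and `tree_nodes` is a hypothetical helper.

```python
def tree_nodes(branching: int, depth: int) -> int:
    """Total nodes in a full tree: sum of branching**i for i in 0..depth."""
    if branching == 1:
        return depth + 1
    # Closed form of the geometric series; exact with Python integers.
    return (branching ** (depth + 1) - 1) // (branching - 1)


# Token-level search: assumed vocabulary of 50,000, looking 10 tokens ahead.
token_level = tree_nodes(50_000, 10)

# Idea-level search: assumed 5 candidate ideas per step, 4 steps deep.
idea_level = tree_nodes(5, 4)

print(f"token-level nodes: {token_level:.3e}")
print(f"idea-level nodes:  {idea_level}")
```

Exhaustive token-level search is out of reach under these assumptions, so any practical scheme has to prune aggressively or branch at a much coarser level. Even then, each node costs at least one model call, which is where the 100–1000× inference multiplier comes from.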
Alignment and superintelligence debates
- Some warn that anything that accelerates the path to superintelligence worsens alignment risks; the article is criticized for ignoring control and the question of “what to optimize for”.
- Others are skeptical that “superintelligence” is even a coherent or reachable concept, or see AGI as requiring multiple unknown breakthroughs and long timelines.
World models, generalization, and LLM limits
- Repeated concern that current LLMs lack robust world models and generalization; they remix text more than they reason.
- Without reliable internal models, search may just traverse a space of biased, sometimes false beliefs.
- Several argue we still need mechanisms to learn usable world models (e.g., from video, rich simulations, adjustable abstraction levels).
Symbolic vs statistical approaches
- Commenters note that classical search, planning, and theorem-proving already have near-optimal algorithms under known tradeoffs (soundness, completeness, efficiency).
- Some advocate hybrid neuro-symbolic systems where logic, simulators, or ontologies provide structure and evaluation, with LLMs as generators.