Grandmaster-level chess without search

Core idea & method

  • The paper is widely read as a knowledge-distillation result: Stockfish's complex search-based evaluation is distilled into a single transformer "intuition" function over board states.
  • Model: ~270M parameters, trained on ~10M games, yielding ~15B positions annotated with Stockfish 16 action-values.
  • Input is the FEN of the position (plus candidate moves in UCI notation) through a custom tokenizer; no explicit search or hand-crafted chess heuristics are used at inference.
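The setup above can be sketched in miniature. This is a hedged illustration, not the paper's code: the vocabulary, padding length (80) and bin count (128) are stand-ins for whatever the authors actually used. The two ingredients are a fixed-length character-level FEN tokenizer and a discretization of the Stockfish win probability into class labels.

```python
# Illustrative sketch of the input/target encoding; all names and sizes
# are assumptions, not the paper's exact values.

FEN_VOCAB = sorted(set("pnbrqkPNBRQK12345678/abcdefgh0 wKQkq-"))
STOI = {ch: i + 1 for i, ch in enumerate(FEN_VOCAB)}  # 0 reserved for padding

def tokenize_fen(fen: str, max_len: int = 80) -> list[int]:
    """Map a FEN string to a fixed-length list of token ids (zero-padded)."""
    ids = [STOI[ch] for ch in fen]
    return ids + [0] * (max_len - len(ids))

def value_to_bin(win_prob: float, num_bins: int = 128) -> int:
    """Discretize a win probability in [0, 1] into a class label,
    turning value prediction into classification."""
    return min(int(win_prob * num_bins), num_bins - 1)
```

With this framing, training reduces to ordinary sequence classification: constant-size token input, one of `num_bins` labels out.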

Performance & evaluation

  • Reported Lichess blitz rating vs humans is ~2895; against bots it is ~700 Elo lower.
  • It outperforms AlphaZero's policy and value networks (used without MCTS) and GPT‑3.5‑turbo‑instruct on puzzle accuracy and playing strength.
  • Some commenters argue the blitz human results are inflated: the bot never flags on time, humans do, and Stockfish fallback in tactical crises may convert lost/drawn positions to wins.
  • Requests for additional baselines: limited-depth Stockfish, positions that require deep search, “anti-bot” openings.

Relation to existing engines

  • Debate over whether this advances practical engine strength: top search-based engines (Stockfish NNUE, Leela) are said to remain clearly stronger under standard conditions.
  • Some note Leela-style transformer work predates this paper and achieves better strength with smaller models.
  • Others see value as a fast GPU-friendly evaluator that could be combined with search for efficient parallel exploration.
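The "fast GPU-friendly evaluator plus search" idea amounts to scoring all legal moves in one batched call rather than one network call per move. A minimal sketch, with `evaluate_batch` as a hypothetical stand-in for the network's forward pass:

```python
from typing import Callable, Sequence

def pick_move(moves: Sequence[str],
              evaluate_batch: Callable[[Sequence[str]], Sequence[float]]) -> str:
    """One-ply 'search': a single batched evaluation over the move list.

    evaluate_batch is any callable that scores a batch of candidate moves;
    on a GPU this is one forward pass instead of len(moves) separate calls.
    """
    scores = evaluate_batch(moves)
    best = max(range(len(moves)), key=lambda i: scores[i])
    return moves[best]
```

Deeper search would apply the same batched call at every node, which is exactly where a cheap parallel evaluator pays off.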

Human-like play & non-GM engines

  • Many participants want engines that play like humans at specific ratings, not just “weakened” super-engines.
  • Suggestions include:
    • Sampling among near-best moves instead of strict best.
    • Conditioning on Elo during training.
    • Training to predict human moves (as in models like Maia).
    • Modeling typical human blunders, attention focus, and style.
  • Several existing projects and ideas are mentioned, but consensus is that truly human-like adjustable engines remain an open problem.
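One concrete version of "sampling among near-best moves" is a softmax over engine scores with a temperature knob: low temperature collapses to the engine's best move, higher temperatures yield the varied, imperfect play discussed above. This is an illustrative sketch, not taken from any existing engine.

```python
import math
import random

def sample_move(moves, scores, temperature=1.0, rng=None):
    """Softmax-sample a move from (moves, scores) at a given temperature."""
    rng = rng or random.Random()
    m = max(scores)
    # subtract the max before exponentiating for numerical stability
    weights = [math.exp((s - m) / temperature) for s in scores]
    r, acc = rng.random() * sum(weights), 0.0
    for move, w in zip(moves, weights):
        acc += w
        if r < acc:
            return move
    return moves[-1]
```

A single temperature knob is still far from Elo-conditioned or Maia-style human modeling, which is why the thread treats those as open problems.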

Critiques & conceptual issues

  • Some argue the “without search” title is misleading because generating labels required massive Stockfish search; the model is essentially a compressed approximation of Stockfish’s search tree.
  • Others frame this as a standard and valuable form of knowledge distillation (“compiling search into a single forward pass”).
  • There is debate over whether custom tokenization and problem-specific setups reduce the generality or significance of the result.

Broader chess-AI context

  • Ongoing discussion about whether chess is "solved" (consensus: only small endgames are, via tablebases covering up to seven pieces; full-game optimal play remains far beyond reach).
  • The work is taken as evidence that high-level play can, in principle, be approximated by a pure evaluation heuristic if trained at sufficient scale.