How to store a chess position in 26 bytes (2022)
Information-theoretic bounds and what’s being encoded
- Several comments note that 26 bytes (208 bits) is well above the theoretical minimum needed to represent all legal positions (including side to move, castling, en passant), which is estimated around 152–153 bits.
- Others point out that if you only encode positions that actually occur in a specific database (“observed positions”), you can compress far more aggressively, even down to just indexing game+ply.
- There’s repeated clarification that counting “board states” is different from “game difficulty”; a huge state space doesn’t necessarily imply a hard game (example: Nim).
Clever vs practical encodings
- One camp praises the article as a fun, clever hack in the traditional “hacker” sense, useful as an exercise in bit-level thinking.
- Another camp argues that such dense encodings are rarely worth the complexity: bytes are cheap, code clarity and CPU cost of packing/unpacking are not.
- Some suggest the right scheme depends heavily on use: storing billions of positions vs querying a database vs transmitting games.
Chess engines, transposition tables, and real-world practice
- Engine-focused comments stress that top engines typically don’t store entire board states in huge tables; they use transposition tables keyed by Zobrist hashes and keep only one full board per search thread.
- Bitboards and related tricks are cited as “standard” for fast move generation rather than ultra-compact storage.
Game state vs board state and rules completeness
- Multiple commenters distinguish between:
- Board position (pieces, side to move, castling, en passant)
- Full game state (50/75-move counter, repetition history, clocks, etc.)
- The article’s scheme is criticized for not capturing repetition and 50-move information, which are required to decide claims of a draw.
- FEN is mentioned as a reference that explicitly includes move counters.
Castling, en passant, and logical concerns
- A central thread debates the article’s trick of overloading piece locations (e.g., using the king’s square as a special code for unmoved rooks/pawns to infer castling or en passant).
- Critics highlight edge cases: rooks or pawns that moved and returned, en passant availability after complex histories, and the difficulty of distinguishing “never moved” from “moved back” purely from piece locations.
- Some propose adding at least one extra bit to disambiguate rook-moved/en-passant-available, while others describe more intricate swap/encoding schemes to avoid extra bits.
Promotion, bishops, and bit-saving tricks
- Commenters note straightforward savings: bishops only ever occupy one color, so they can use 5 bits instead of 6 per bishop; this alone can shave several bits.
- There is extended back-and-forth about how many extra bits promotions truly cost in the worst case, with people constructing upper bounds and arguing about pawn-capture constraints.
Alternative encoding schemes (often ≈24–26 bytes)
- Multiple alternative fixed-length encodings are sketched:
- Occupancy bitboard (64 bits) plus piece codes for occupied squares (e.g., 4 bits each), reaching 24 bytes worst case.
- Masks of which of the 32 nominal pieces exist, plus 6-bit coordinates per surviving piece, with careful handling of cases where many pieces are missing to keep under 26 bytes.
- Schemes that use compact 2-bit prefixes and suffixes to encode up to 12 piece types per square.
- Lichess’s production format is cited: 64-bit occupancy, then variable-length piece and state data, efficient on average and able to handle Chess960.
Moves vs positions; whole games vs single states
- Some argue that whole games are best stored as move lists, not position lists, and point to work compressing moves to ~3.7 bits each.
- Another commenter suggests directly encoding game histories as sequences of legal moves (using a move generator to index moves), potentially beating 26 bytes for typical full games but not as a universal worst-case position encoding.
“Bit-level magic” and implementation details
- One commenter objects that the article’s examples use decimal/character strings and integers, not literal bitstreams, and accuses it of misusing the term “bit-level”.
- Others respond that the examples are conceptual; the key is the count of possibilities (e.g., 495 promotion patterns fitting in 9 bits), regardless of how the article illustrates them.
- There is some confusion between theoretical bit counts and actual memory layout (bytes, padding, alignment), with clarification that a 9‑bit field will occupy 2 bytes in most practical layouts, but still only carries 9 bits of information.