PSA: SQLite WAL checksums fail silently and may lose data

Purpose of SQLite WAL checksums

  • Several commenters argue WAL checksums are primarily to detect partial / out‑of‑order writes, not arbitrary corruption or “bit rot.”
  • Checksums let SQLite determine whether a transaction was fully written (and fsynced) before a crash, so it can distinguish complete from incomplete journal entries and treat only the complete ones as committed.
  • SQLite uses checksums on both rollback journal and WAL; database file pages themselves are not checksummed by default (except via optional VFS extensions).
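The WAL-side checksums mentioned above live in the WAL file itself. Per SQLite's documented WAL file format, the file opens with a 32-byte big-endian header whose last two 32-bit fields are the running checksum of the header. A minimal Python sketch (standard library only; the field layout is taken from the WAL format documentation) that creates a WAL-mode database and decodes that header:

```python
import os
import sqlite3
import struct
import tempfile

tmp = tempfile.mkdtemp()
db = os.path.join(tmp, "demo.db")

conn = sqlite3.connect(db, isolation_level=None)  # autocommit mode
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE t(x)")
conn.execute("INSERT INTO t VALUES (1)")          # forces frames into the -wal file

# WAL header: magic, format version, page size, checkpoint sequence,
# salt-1, salt-2, checksum-1, checksum-2 -- eight big-endian 32-bit words.
with open(db + "-wal", "rb") as f:
    magic, version, page_size, ckpt_seq, salt1, salt2, ck1, ck2 = \
        struct.unpack(">8I", f.read(32))

# Magic is 0x377f0682 (little-endian checksums) or 0x377f0683 (big-endian).
print(hex(magic), version, page_size)
conn.close()
```

The two salt values and the cumulative checksum chain are what let recovery tell a frame from a previous WAL generation (or a torn write) apart from a validly committed one.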

Why truncating at first bad frame is seen as correct

  • Once a checksum fails, all subsequent frames are untrustworthy, because:
    • The WAL describes page‑level B‑tree changes; skipping frames can break structural invariants and cause deeper corruption.
    • Later frames may depend on earlier ones (e.g., page splits, pointer rewrites), so applying them without the missing changes is unsafe.
  • Chaining checksums means corruption in the middle is treated like a “rockslide”: replay up to the obstruction, then stop; this leaves the DB in a state that definitely existed.
  • Many see this not as “data loss,” but as rolling back untrustworthy transactions to the last known‑good state.
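The "rockslide" behavior is easy to reproduce. The sketch below (Python, standard library only; the byte offsets assume the documented WAL layout of a 32-byte header followed by frames of a 24-byte frame header plus one database page) commits ten single-row transactions, simulates a crash so the WAL is left behind, flips one byte in the middle of the WAL, and reopens. No error is raised, and the transactions after the corrupted frame are silently gone:

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

tmp = tempfile.mkdtemp()
db = os.path.join(tmp, "demo.db")

# 1. Create the table in rollback-journal mode so the schema lives in the
#    main database file, not in the WAL we are about to corrupt.
conn = sqlite3.connect(db, isolation_level=None)
conn.execute("CREATE TABLE t(x)")
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
conn.close()

# 2. In a child process: switch to WAL, commit ten transactions, then exit
#    without closing, so no checkpoint runs and the -wal file survives.
child = f"""
import sqlite3, os
conn = sqlite3.connect({db!r}, isolation_level=None)
conn.execute("PRAGMA journal_mode=WAL")
for i in range(10):
    conn.execute("INSERT INTO t VALUES (?)", (i,))
os._exit(0)   # simulate a crash: no checkpoint, WAL left on disk
"""
subprocess.run([sys.executable, "-c", child], check=True)

# 3. Flip one byte inside the sixth frame's page data.
frame_size = 24 + page_size
offset = 32 + 5 * frame_size + 24 + 100
try:
    os.remove(db + "-shm")          # force a full WAL recovery on next open
except FileNotFoundError:
    pass
with open(db + "-wal", "r+b") as f:
    f.seek(offset)
    original = f.read(1)
    f.seek(offset)
    f.write(bytes([original[0] ^ 0xFF]))

# 4. Reopen: recovery replays frames up to the bad checksum and stops.
conn = sqlite3.connect(db)
count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)   # fewer than the 10 rows committed, yet the DB "opens fine"
conn.close()
```

Which rows survive depends on which frame the corrupted byte lands in; the point of the demonstration is only that the later commits vanish without any error being surfaced to the application.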

Disagreement over “silent” behavior and recovery options

  • The article’s author and some commenters want:
    • Errors raised on checksum failure.
    • An option to preserve the WAL for manual / tool‑driven recovery or partial salvage in special cases (e.g., corrupted frames that aren’t logically needed).
  • Others counter that:
    • Most applications just want a consistent DB that opens; refusing to start on WAL issues would be worse.
    • Attempts at partial application or heuristic recovery would be arbitrary and dangerous for transactional semantics.
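One low-tech way to approximate the "preserve the WAL for manual recovery" option today is to copy the files aside before SQLite ever opens them, since recovery (and any truncation) happens on first open. A sketch of such a helper; `snapshot_before_open` is a hypothetical name, not a SQLite API:

```python
import os
import shutil

def snapshot_before_open(db_path: str, quarantine_dir: str) -> None:
    """Copy the database and any journal files aside before SQLite opens
    them, so a truncating WAL recovery can still be inspected by hand."""
    os.makedirs(quarantine_dir, exist_ok=True)
    for suffix in ("", "-wal", "-shm", "-journal"):
        src = db_path + suffix
        if os.path.exists(src):
            shutil.copy2(src,
                         os.path.join(quarantine_dir, os.path.basename(src)))
```

Calling this on every startup before `sqlite3.connect()` keeps a pristine copy of the pre-recovery state without changing SQLite's crash-safety behavior at all.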

Checksums, algorithms, and corruption scope

  • Debate over Fletcher vs CRC:
    • CRCs have stronger detection guarantees and, in some implementations, can even correct 1–2 bit errors; Fletcher is seen as weaker and somewhat outdated.
    • Counter‑argument: for SQLite’s purpose (detecting partial writes), Fletcher is sufficient; full anti‑bit‑rot protection would also require page checksums on the main DB.
  • Some note real corruption can occur in transit (controller, bus, memory) despite SATA/NVMe ECC, though rates and seriousness are debated.
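As a concrete illustration of the Fletcher criticism: textbook Fletcher-32 (note that SQLite's WAL actually uses its own Fletcher-like checksum over 32-bit words, not this exact algorithm) has a well-known blind spot where an all-zero block and an all-ones block checksum identically, while CRC-32 tells them apart:

```python
import zlib

def fletcher32(data: bytes) -> int:
    """Textbook Fletcher-32 over 16-bit little-endian words, sums mod 65535."""
    assert len(data) % 2 == 0
    s1 = s2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        s1 = (s1 + word) % 65535   # 0x0000 and 0xFFFF are congruent mod 65535,
        s2 = (s2 + s1) % 65535     # so all-zero and all-ones blocks collide
    return (s2 << 16) | s1

zeros = b"\x00" * 64
ones = b"\xff" * 64
print(fletcher32(zeros) == fletcher32(ones))   # True: Fletcher can't tell them apart
print(zlib.crc32(zeros) == zlib.crc32(ones))   # False: CRC-32 distinguishes them
```

Whether that class of weakness matters here is exactly the dispute above: for detecting a torn tail-of-WAL write, either algorithm stops replay at the right place almost all of the time.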

Filesystem and storage assumptions

  • SQLite assumes reasonably correct fsync semantics; if fsync “lies” or the filesystem silently corrupts data, SQLite cannot fully defend against that.
  • Discussion of ZFS, ext4, btrfs:
    • ZFS’s strong checksumming is praised, but its interaction with SQLite fsync and txg timeouts can cause hangs unless tuned (e.g., sync=disabled), which then risks data loss.
    • Many databases and common filesystems still use relatively weak 16–32‑bit checksums, often disabled or metadata‑only.

Critiques of the article’s framing

  • Several commenters think calling this a “PSA” about data loss is misleading:
    • The behavior is intentional, documented, and primarily about crash safety, not end‑to‑end corruption detection.
    • The simulated WAL‑only corruption scenario is considered contrived compared to broader risks (main DB corruption, full sector loss, missing .db‑shm, etc.).
  • Some also object to implying SQLite is hard to contribute to or “limping along,” noting that corruption‑detection features (checksum VFS) and clear design assumptions already exist.