PSA: SQLite WAL checksums fail silently and may lose data
Purpose of SQLite WAL checksums
- Several commenters argue WAL checksums are primarily to detect partial / out‑of‑order writes, not arbitrary corruption or “bit rot.”
- Checksums let SQLite tell whether a transaction was fully written and fsynced before a crash, so it can distinguish complete from incomplete journal entries and safely treat only complete ones as committed.
- SQLite uses checksums on both rollback journal and WAL; database file pages themselves are not checksummed by default (except via optional VFS extensions).
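The chained, per-frame checksum described above can be sketched roughly as follows. This is a simplified Python illustration of a Fletcher-style running checksum over pairs of 32-bit words, in the spirit of the WAL format; the word size, byte order, and function name are assumptions for illustration, not SQLite's actual code.

```python
import struct

def wal_checksum(data: bytes, s1: int = 0, s2: int = 0) -> tuple[int, int]:
    """Fletcher-style running checksum (simplified sketch, not SQLite's).

    Processes the data as consecutive pairs of big-endian 32-bit words,
    carrying the (s1, s2) state forward so that each frame's checksum
    depends on every frame before it ("chained" checksums).
    """
    usable = len(data) - (len(data) % 8)  # whole 8-byte chunks only
    for i in range(0, usable, 8):
        x0, x1 = struct.unpack_from(">II", data, i)
        s1 = (s1 + x0 + s2) & 0xFFFFFFFF
        s2 = (s2 + x1 + s1) & 0xFFFFFFFF
    return s1, s2
```

Because the state chains forward, verifying frame N implicitly re-verifies the ordering and completeness of everything written before it, which is exactly the partial/out-of-order write detection commenters say the checksums are for.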
Why truncating at first bad frame is seen as correct
- Once a checksum fails, all subsequent frames are untrustworthy, because:
  - The WAL describes page‑level B‑tree changes; skipping frames can break structural invariants and cause deeper corruption.
  - Later frames may depend on earlier ones (e.g., page splits, pointer rewrites), so applying them without the missing changes is unsafe.
- Chaining checksums means corruption in the middle is treated like a “rockslide”: replay up to the obstruction, then stop; this leaves the DB in a state that definitely existed.
- Many see this not as “data loss,” but as rolling back untrustworthy transactions to the last known‑good state.
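The "rockslide" rule above can be sketched as a small recovery loop: walk the WAL in order, verify each frame against the running checksum, stop at the first mismatch, and roll back to the last commit boundary before it. The frame tuples and `toy_checksum` here are hypothetical stand-ins, not SQLite's real frame format.

```python
def toy_checksum(prev: int, payload: bytes) -> int:
    # Toy chained checksum: each frame's value depends on all prior frames.
    return (prev * 31 + sum(payload)) & 0xFFFFFFFF

def frames_to_replay(frames) -> int:
    """frames: iterable of (payload, stored_checksum, is_commit) tuples.

    Returns how many leading frames may be replayed: everything up to the
    last commit frame that precedes the first checksum mismatch.
    """
    running = 0
    seen = 0
    last_commit = 0
    for payload, stored, is_commit in frames:
        running = toy_checksum(running, payload)
        if running != stored:
            break                  # later frames may depend on this one: stop
        seen += 1
        if is_commit:
            last_commit = seen     # only commit boundaries are valid states
    return last_commit
```

Truncating at `last_commit` rather than at the bad frame itself is what makes the result a state that "definitely existed": a half-applied transaction is never exposed.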
Disagreement over “silent” behavior and recovery options
- The article’s author and some commenters want:
  - Errors raised on checksum failure.
  - An option to preserve the WAL for manual / tool‑driven recovery or partial salvage in special cases (e.g., corrupted frames that aren’t logically needed).
- Others counter that:
  - Most applications just want a consistent DB that opens; refusing to start on WAL issues would be worse.
  - Attempts at partial application or heuristic recovery would be arbitrary and dangerous for transactional semantics.
Checksums, algorithms, and corruption scope
- Debate over Fletcher vs CRC:
  - CRCs can sometimes correct 1–2 bit errors and have stronger detection; Fletcher is seen as weaker and somewhat outdated.
  - Counter‑argument: for SQLite’s purpose (detecting partial writes), Fletcher is sufficient; full anti‑bit‑rot protection would also require page checksums on the main DB.
- Some note real corruption can occur in transit (controller, bus, memory) despite SATA/NVMe ECC, though rates and seriousness are debated.
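For a concrete (and deliberately toy) sense of the comparison, both checksum families catch a single flipped bit; the debate above is about burst-error guarantees and error correction, which this sketch does not measure. The byte-wise Fletcher-style function below is a textbook illustration, not SQLite's variant; `zlib.crc32` is the Python stdlib CRC-32.

```python
import zlib

def fletcher32(data: bytes) -> int:
    """Byte-wise Fletcher-style checksum (illustrative only)."""
    s1 = s2 = 0
    for b in data:
        s1 = (s1 + b) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1

good = b"hello wal frame"
corrupt = bytearray(good)
corrupt[3] ^= 0x40                 # flip one bit in transit
corrupt = bytes(corrupt)
```

Both `fletcher32(good) != fletcher32(corrupt)` and `zlib.crc32(good) != zlib.crc32(corrupt)` hold for this single-bit flip; the argument for Fletcher is that, for detecting torn writes rather than adversarial or burst corruption, this level of detection is enough and cheaper to compute.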
Filesystem and storage assumptions
- SQLite assumes reasonably correct fsync semantics; if fsync “lies” or the filesystem silently corrupts data, SQLite cannot fully defend against that.
- Discussion of ZFS, ext4, btrfs:
  - ZFS’s strong checksumming is praised, but its interaction with SQLite fsync and txg timeouts can cause hangs unless tuned (e.g., sync=disabled), which then risks data loss.
  - Many databases and common filesystems still use relatively weak 16–32‑bit checksums, often disabled or metadata‑only.
Critiques of the article’s framing
- Several commenters think calling this a “PSA” about data loss is misleading:
  - The behavior is intentional, documented, and primarily about crash safety, not end‑to‑end corruption detection.
  - The simulated WAL‑only corruption scenario is considered contrived compared to broader risks (main DB corruption, full sector loss, missing .db‑shm, etc.).
- Some also object to implying SQLite is hard to contribute to or “limping along,” noting that corruption‑detection features (checksum VFS) and clear design assumptions already exist.