100k TPS over a billion rows: the unreasonable effectiveness of SQLite
Vertical scaling and hardware choices
- Many argue the article’s results apply only when all data and compute fit on a single machine; others counter that modern “big box” servers (24 TB RAM, large NVMe) provide huge headroom.
- Several prefer cheap bare-metal (e.g., Hetzner) over AWS for single-node performance and cost, but others complain about Hetzner’s KYC/bureaucracy and spotty onboarding, especially outside the EU.
- Some highlight that for stable workloads, vertical overprovisioning is often cheaper and simpler than complex distributed setups, especially when engineering headcount is considered.
- Others point out vertical scaling has poor elasticity for spiky workloads (e.g., Black Friday); scale-out still matters there.
Network latency vs embedded databases
- A core discussion theme is that network latency and Amdahl’s law can dominate throughput for “interactive transactions” with multiple round trips and application logic in between.
- Many endorse the article’s framing: reconsider whether the database needs to be remote at all; local/embedded DBs can beat “better” remote ones by orders of magnitude.
- Some push back that the article mixes configurations (remote Postgres vs embedded SQLite) and that similar gains might be possible with a local Postgres tuned appropriately or with stored procedures/triggers.
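The latency argument can be made concrete with a toy Amdahl-style model. The numbers below are illustrative assumptions, not measurements from the article: each statement in an "interactive transaction" costs one network round trip, so per-connection throughput is capped by network time plus database time.

```python
# Toy model (assumed numbers): how per-statement round trips cap
# single-connection throughput for an interactive transaction,
# regardless of how fast the database engine itself is.

def max_tps(round_trips: int, latency_ms: float, db_work_ms: float = 0.1) -> float:
    """Upper bound on transactions/sec for one connection:
    wall time per transaction = network time + database time."""
    total_ms = round_trips * latency_ms + db_work_ms
    return 1000.0 / total_ms

# A 5-statement interactive transaction over a 1 ms network link:
remote = max_tps(round_trips=5, latency_ms=1.0)
# The same transaction against an embedded database (no network hop):
local = max_tps(round_trips=5, latency_ms=0.0)

print(f"remote: ~{remote:.0f} TPS/conn, local: ~{local:.0f} TPS/conn")
```

Even with a fast database (0.1 ms of real work), the remote case is limited to roughly 200 TPS per connection, while the embedded case is limited only by the database work itself — the orders-of-magnitude gap commenters describe.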
Benchmark methodology and fairness
- Commenters question:
  - Using SERIALIZABLE isolation for Postgres, where it may be unnecessary.
  - Assuming 5–10 ms of network latency, which some consider unrealistic for colocated servers.
  - Using small Postgres connection pools; others note that larger pools worsened contention in this particular workload.
  - Using synchronous=NORMAL for SQLite, which relaxes durability; the post was later updated with FULL numbers, narrowing but not erasing SQLite’s advantage.
Concurrency, WAL, and reliability
- Several share strong positive experiences with SQLite performance (WAL mode, mmap, batching, SAVEPOINT, streaming backups).
- Others report pain points: database locks, WAL not checkpointing and growing without bound, severe slowdowns, and difficulties on cloud “local” disks.
- Threads on WAL corruption conclude that if the disk, filesystem, and RAM are sound, SQLite is generally safe, but it doesn’t protect against underlying hardware issues; open questions remain about recovery severity and checksum/Merkle-based replication schemes.
- Recommended patterns include:
  - A single writer connection behind an MPSC queue, with multiple read-only connections for queries.
  - WAL mode, careful checkpointing (possibly via litestream), and avoiding shared/network filesystems.
High availability, replication, and scale-out
- SQLite’s main limitation repeatedly cited: it scales up, not out; no built-in clustering, multi-writer, or transparent failover.
- For HA/replication, commenters mention litestream, LiteFS, rqlite, dqlite, rsync-based replication, and emerging projects like Marmot.
- Event sourcing + projections (multiple SQLite DBs fed from an append-only log) is proposed as a way to get zero-downtime migrations and sharded scaling, but acknowledged as a significant architectural shift.
- Some note SQLite is unsuitable where strict RPO=0 or enterprise-grade HA is required; traditional RDBMSs are still preferred there.
Real-world use and when to choose SQLite vs Postgres
- Links and anecdotes mention SQLite at very high QPS on single servers, vector search (sqlite-vec), content stores, and small self-hosted web apps.
- Many see SQLite + vertical scaling as ideal for simple, single-node, or “local-first” systems, especially when some downtime is acceptable.
- Others argue that for general multi-tenant, networked business apps with strong HA, rich types, and multiple writers, Postgres or other server DBs remain the safer default.
- There is also backlash against recurring “SQLite worship” threads, with some saying they underplay operational complexity and niche fit.