2024-12-16

In Search of a Faster SQLite

Benchmark results and practical relevance

The paper shows large improvements only at extreme tail latencies (p999 and above); p90 to p99 are similar to SQLite.
Workload (fetching 100 rows without filters/sorts) is seen as too trivial and I/O-light to be broadly meaningful.
Several commenters conclude the gains mainly matter in heavily contended, multitenant scenarios, not in single-node monoliths.
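
To make the percentile notation concrete, here is a minimal sketch of how a handful of slow outliers can leave p50 and p99 untouched while dominating p999. The latency values are synthetic and invented for illustration; they are not figures from the paper.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Synthetic latencies in microseconds (made up): 9,980 fast requests, 20 slow outliers.
latencies_us = [100] * 9980 + [50_000] * 20

summary = {
    name: percentile(latencies_us, p)
    for name, p in [("p50", 50), ("p99", 99), ("p999", 99.9)]
}
print(summary)
```

With only 0.2% of requests slow, the median and p99 do not move at all, which is why a system can look identical to SQLite at p90 to p99 and still differ sharply further out in the tail.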

Serverless / edge patterns with SQLite

Several commenters describe prebuilding SQLite databases and storing them in S3 or in container images for Lambda/“serverless” functions.
Pattern: a periodic job parses source data → writes and indexes a SQLite database → uploads it to S3 or bakes it into the image; functions download it or read a local copy for very fast lookups.
Local /tmp and persistent global scope in Lambda are highlighted as useful caches.
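
The pattern above might look roughly like the following sketch, which uses only Python's standard `sqlite3` module. The table layout, the `build_lookup_db` and `handler` names, and the `db_path` parameter are hypothetical, and the upload/download step is left out because it depends on the storage client in use.

```python
import os
import sqlite3

def build_lookup_db(path, rows):
    """Periodic job: write source rows into a fresh, indexed SQLite file."""
    tmp_path = path + ".tmp"
    if os.path.exists(tmp_path):
        os.remove(tmp_path)
    conn = sqlite3.connect(tmp_path)
    try:
        conn.execute("CREATE TABLE items (sku TEXT NOT NULL, name TEXT NOT NULL)")
        conn.executemany("INSERT INTO items (sku, name) VALUES (?, ?)", rows)
        conn.execute("CREATE UNIQUE INDEX idx_items_sku ON items (sku)")
        conn.commit()
    finally:
        conn.close()
    # Swap the finished file into place so readers never see a half-written database.
    os.replace(tmp_path, path)
    # An upload to object storage (or a copy into an image) would follow here.

_conn = None  # module-level cache, reused while the function instance stays warm

def get_connection(path):
    """Open the database read-only once and reuse the connection afterwards."""
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    return _conn

def handler(event, context=None, db_path="/tmp/lookup.db"):
    """Hypothetical function entry point: one indexed point lookup per call."""
    row = get_connection(db_path).execute(
        "SELECT name FROM items WHERE sku = ?", (event["sku"],)
    ).fetchone()
    return {"name": row[0] if row else None}
```

Opening the file with `mode=ro` makes it explicit that the function never writes, and keeping the connection in module scope means only a cold start pays the cost of opening it.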

Consistency, S3 scaling, and “thundering herd”

Clock-based freshness checks can be unsafe for strict consistency; ETags or explicit version files are suggested instead.
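
A version-token check of this kind could be sketched as below. The `VersionedCache` class and its two callables are hypothetical stand-ins for whatever the storage client provides (for example, a metadata request that returns an ETag and a request that returns the object body); no real client API is assumed.

```python
from typing import Callable, Optional

class VersionedCache:
    """Re-download a payload only when its remote version token (e.g. an ETag) changes."""

    def __init__(self, fetch_version: Callable[[], str], download: Callable[[], bytes]):
        self._fetch_version = fetch_version
        self._download = download
        self._version: Optional[str] = None
        self._payload: Optional[bytes] = None

    def get(self) -> bytes:
        remote = self._fetch_version()
        if remote != self._version:
            # Token differs from what we hold: fetch the new copy, then record the token.
            payload = self._download()
            self._payload, self._version = payload, remote
        return self._payload
```

Because the comparison is on an opaque token rather than a timestamp, clock skew between writer and readers cannot make a stale copy look fresh. If the object changes between the two calls, the recorded token is simply out of date and the next `get()` downloads again.
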
Some report Lambda-level throttling and S3 rate limits when many functions pull the same DB at once; approaches include bundling DB in images or staggering rollout.
Discussion of S3 “prefix” sharding, request limits, and the danger of hot date-based prefixes.
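
One way to avoid a hot date-based prefix is to put a short hash-derived shard in front of the date, as in this sketch. The key layout, the `sharded_key` name, and the shard count are assumptions for illustration, not a scheme taken from the discussion.

```python
import hashlib

def sharded_key(tenant_id: str, date: str, shards: int = 16) -> str:
    """Build an object key whose leading segment is a stable hash shard, not the date."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    return f"{shard:02x}/{date}/{tenant_id}.db"

print(sharded_key("tenant-42", "2024-12-16"))
```

Keys for the same day then spread across up to `shards` leading prefixes instead of piling onto one, while a given tenant always maps to the same shard and stays easy to locate.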

io_uring, async I/O, and safety

Commenters question whether major cloud/edge providers enable io_uring, given past vulnerabilities and SELinux policy complexity; the answer is currently unclear and may vary by platform.
Async I/O mainly helps concurrency and thread count, not individual query latency, though batching via ring buffers can cut syscall overhead.
Concern about trading off simplicity and safety; others argue async I/O is overdue on Linux and can be robust (e.g., other databases using it).

Rewriting SQLite in Rust and project politics

Limbo (Rust rewrite) and libSQL raise debate:
- Pro: open-source, MIT-licensed, adding “cloud-capable” features, building in the open, targeting SQLite API/file compatibility and strong testing (Deterministic Simulation Testing + external tools).
- Con: seen by some as hype-driven, overlapping with SQLite’s domain without matching its test rigor yet, and as marketing itself as a “better SQLite” before parity.
There is criticism of how SQLite’s ethics/code-of-conduct framing is contrasted with the new projects’ code-of-conduct; others defend the new projects’ approach.

Testing, correctness, and TH3

SQLite’s extensive, partly proprietary test harness (TH3) is viewed as a high bar.
Suggestions: license TH3 or rely on open test suites plus DST.
Some doubt whether new testing strategies will match SQLite’s long-proven reliability; others are optimistic but note it “remains to be seen.”

SQLite governance and “open source”

Debate over whether SQLite is “open source but not open contribution”:
- It’s public domain and accepts only carefully vetted, public-domain-dedicated contributions, often rewritten.
- Some see lack of GitHub-style PRs as non-OSS in spirit; others argue open contribution is not required for user freedom and that this model is reasonable for such critical software.

SQL vs key–value stores

Several argue KV stores are awkward for relational/graph-like data and are better as internal building blocks; SQL offers better developer experience for table-like data.
Counterpoint: SQL makes nested/0NF structures (tables within tables, hierarchical comments) awkward; requires joins, CTEs, views, or JSON aggregation.
Others respond that relational modeling intentionally avoids binding to a single access pattern and that such transformations belong in queries/views.
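
As an illustration of both sides of this exchange, the sketch below stores hierarchical comments as a flat table and rebuilds one thread in display order with a recursive CTE. The schema and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES comments(id),
        body TEXT NOT NULL
    );
    INSERT INTO comments (id, parent_id, body) VALUES
        (1, NULL, 'root'),
        (2, 1, 'first reply'),
        (3, 2, 'nested reply'),
        (4, 1, 'second reply');
""")

# Walk from the root to its descendants, carrying the depth and a sortable
# path of zero-padded ids so that each reply is listed under its parent.
THREAD_QUERY = """
WITH RECURSIVE thread(id, body, depth, path) AS (
    SELECT id, body, 0, printf('%04d', id)
    FROM comments WHERE id = ?
    UNION ALL
    SELECT c.id, c.body, t.depth + 1, t.path || '/' || printf('%04d', c.id)
    FROM comments AS c JOIN thread AS t ON c.parent_id = t.id
)
SELECT depth, body FROM thread ORDER BY path
"""

rows = conn.execute(THREAD_QUERY, (1,)).fetchall()
for depth, body in rows:
    print("  " * depth + body)
```

The query is noticeably more involved than reading a nested document, which is the counterpoint's complaint; the same flat table can also answer "all comments by author" or "latest comments" without restructuring, which is the relational side's reply.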

Other databases and async efforts

Anecdotes that SQLite can outperform client/server RDBMS (e.g., Postgres, MSSQL) in single-machine setups due to lack of network and IPC overhead and effective page caching.
Pointer to past and ongoing work to add async I/O and pluggable storage managers in Postgres, suggesting a broader interest in async storage backends, not just for SQLite.
