Building SQLite with a small swarm

Test Coverage, Correctness, and “Did It Work?”

  • Multiple commenters ask whether the implementation passed SQLite’s official test suite; it did not.
  • The project’s tests against SQLite as an “oracle” are minimal (a few simple SELECTs), nowhere near the millions of test cases SQLite itself runs.
  • Lack of rigorous testing makes claims like “implemented most SQLite operations” unreliable; even the author later acknowledges over‑trusting the model’s self‑report.
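The oracle-testing approach the commenters describe can be sketched with Python's stdlib `sqlite3` module as the reference. The `candidate_execute` function below is a hypothetical stand-in for the reimplementation under test; for the sake of a runnable demo it simply delegates to a second SQLite connection:

```python
import sqlite3

def oracle_execute(conn, sql):
    """Run SQL against the reference SQLite and return rows in a canonical order."""
    return sorted(conn.execute(sql).fetchall())

def candidate_execute(conn, sql):
    # Hypothetical stand-in for the engine under test; here it just
    # delegates to another SQLite connection so the demo runs end to end.
    return sorted(conn.execute(sql).fetchall())

def differential_test(setup_sql, queries):
    """Apply identical setup to oracle and candidate, then diff every query."""
    oracle = sqlite3.connect(":memory:")
    candidate = sqlite3.connect(":memory:")
    for conn in (oracle, candidate):
        conn.executescript(setup_sql)
    mismatches = []
    for q in queries:
        expected = oracle_execute(oracle, q)
        actual = candidate_execute(candidate, q)
        if expected != actual:
            mismatches.append((q, expected, actual))
    return mismatches

setup = """
CREATE TABLE t(a INTEGER, b TEXT);
INSERT INTO t VALUES (1, 'x'), (2, 'y'), (3, 'z');
"""
queries = [
    "SELECT a FROM t WHERE a > 1",
    "SELECT count(*), max(a) FROM t",
    "SELECT b FROM t ORDER BY a DESC",
]
print(differential_test(setup, queries))  # → [] when candidate matches oracle
```

A handful of hand-picked SELECTs like these is exactly the "minimal" coverage being criticized; a serious oracle harness would generate queries at scale and fuzz the inputs.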

Code Quality vs SQLite

  • Reviewers who inspected the code describe it as basic and incomplete: no concurrency, linear free-list search, TODOs for critical behaviors (e.g., freeing overflow pages), naive buffer cloning, and a very limited query planner.
  • It’s seen as potentially “basically working” for simple embedded use, but nowhere close to SQLite’s robustness, performance, or engineering standards.
  • SQLite’s huge, public test suite and additional proprietary TH3 tests are repeatedly cited as the benchmark for quality.
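The "linear free-list search" criticism is about allocation cost: scanning the whole list of free pages on every allocation is O(n) per call. A toy sketch (hypothetical, not the project's actual code) contrasts that with O(1) reuse of the most recently freed page, which is roughly what SQLite's trunk/leaf freelist structure enables:

```python
# Toy pager free list (hypothetical): page numbers currently marked free.
free_pages = [5, 9, 12, 30, 41]

def alloc_linear(free, wanted_near):
    """O(n) allocation: scan for the free page closest to `wanted_near`."""
    best = min(free, key=lambda p: abs(p - wanted_near))
    free.remove(best)  # a second O(n) pass to delete it
    return best

def alloc_pop(free):
    """O(1) allocation: just reuse the most recently freed page."""
    return free.pop()

print(alloc_linear(free_pages, 10))  # → 9 (closest free page to 10)
print(alloc_pop(free_pages))         # → 41
```

Linear scans are fine for a demo database with a few hundred pages, but they are one of the gaps reviewers point to between "basically working" and SQLite's engineering standards.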

Rust, Memory Safety, and SQLite Security

  • One thread suggests that an unsafe‑free Rust implementation might avoid memory‑corruption vulnerabilities, even if it “eats your data.”
  • Others push back, arguing SQLite’s CVEs are often overblown and that the project’s own security statements can feel dismissive or arrogant.
  • Debate arises over whether SQLite’s C plus exhaustive testing can really be strictly “less safe” than a young Rust reimplementation.

Value, Naming, and “Simulacra”

  • Strong criticism of calling this “building SQLite” when it fails the test suite; several prefer framing it as “wrote an embedded database.”
  • Some argue these projects are mostly demos or props—“simulacra” of complex systems—useful for hype, not production.
  • Others see genuine value in proving agents can approximate complex architectures from tests, or in the idea of clean‑room reimplementations.

Agents, Orchestration, and Validation

  • The author frames the project as an experiment in multi‑agent orchestration (six heterogeneous models) rather than a viable DB.
  • Commenters highlight validation as the real bottleneck; more agents and parallelism mostly create coordination overhead and messy code.
  • There’s skepticism that agents can “iron out bugs” without introducing others, even with test suites.

Meta: Novelty, Licensing, and Practical Use

  • Several point out that re‑creating existing OSS with LLMs is essentially “laundering” public code and offers little novelty.
  • Others respond that most real‑world software is pattern‑rehash anyway, so brute‑forcing similar systems can still be economically valuable.
  • Some call for more ambitious or genuinely new targets (e.g., “Wine for macOS apps”) rather than weaker clones of existing tools.