Zero-latency SQLite storage in every Durable Object
Use cases and target workloads
- Strong interest in DOs for realtime, multiplayer-style apps (docs/design tools, games, collaborative UIs). Single logical “document/room” maps cleanly to one DO + SQLite.
- Suggested also for low-traffic internal tools or per-tenant isolation, where a full managed Postgres instance feels heavy or too costly.
- Skeptics argue most production systems should stick with “boring” Postgres/VMs until needs are extreme, due to maturity and edge-case risk.
Durability, latency, and consistency
- Each DO has its own local SQLite DB. All operations on that DB are routed to the single DO instance worldwide, so it always has a consistent view.
- Writes are committed locally, then synchronously replicated to 5 nearby replicas; the write is acknowledged after 3 confirm.
- WAL chunks are also streamed to object storage every ~16MB or 10 seconds for backup/rollback. Some posters worry this implies up to ~10s of potential data loss on crash; others clarify that durability comes from the synchronous replicas, while object storage serves only backup and point-in-time recovery (not primary reads), so the streaming interval does not add a loss window.
- Within a DO, reads see writes immediately. There is no cross-region “read after write” issue because you cannot bypass the DO to read the DB directly.
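The write path above (replicate to 5, acknowledge once 3 confirm) is a classic quorum wait. A minimal sketch in Python, assuming simulated replicas — this models the idea only, not Cloudflare's implementation; `send_to_replica` is a hypothetical stand-in for a network send plus fsync on the replica:

```python
import concurrent.futures
import random
import time

REPLICAS = 5
QUORUM = 3  # the write is acknowledged once this many replicas confirm

def send_to_replica(replica_id: int, record: bytes) -> int:
    """Simulated replica write: stand-in for network transfer + fsync."""
    time.sleep(random.uniform(0.001, 0.01))
    return replica_id

def replicate(record: bytes) -> list[int]:
    """Fan a committed write out to all replicas; ack as soon as a quorum confirms."""
    confirmed: list[int] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=REPLICAS) as pool:
        futures = [pool.submit(send_to_replica, i, record) for i in range(REPLICAS)]
        for fut in concurrent.futures.as_completed(futures):
            confirmed.append(fut.result())
            if len(confirmed) >= QUORUM:
                break  # quorum reached; stop waiting for the slower replicas
    return confirmed

print(len(replicate(b"wal-frame")) >= QUORUM)
```

The point of the quorum is that up to two replicas can be slow or down without blocking acknowledgement, while any three surviving replicas still intersect with the ack set.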
Location, scaling, and hot partitions
- DOs are created in a region (optionally hinted) and currently do not automatically relocate, though future dynamic relocation is planned.
- Only a subset of Cloudflare PoPs host DOs; other PoPs forward traffic.
- Single-partition hotspots are called out as a concern; the counterpoint is that SQLite can handle very high write rates for many workloads, and reads can be offloaded via caching.
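The read-offloading idea mentioned above can be sketched as a TTL cache placed in front of the single hot object. This is an illustrative toy in Python, not anything Cloudflare-specific; `query_object` is a hypothetical stand-in for a query routed to the DO:

```python
import time

class CachedReader:
    """Toy read-through TTL cache in front of a single-writer store."""
    def __init__(self, backend, ttl_seconds: float = 5.0):
        self.backend = backend  # function that would query the hot object
        self.ttl = ttl_seconds
        self._cache: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        hit = self._cache.get(key)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]            # serve recent data, sparing the hot object
        value = self.backend(key)    # only cache misses reach the single writer
        self._cache[key] = (time.monotonic(), value)
        return value

# Hypothetical backend that records each time the hot object is actually hit.
calls: list[str] = []
def query_object(key: str) -> str:
    calls.append(key)
    return key.upper()

reader = CachedReader(query_object, ttl_seconds=60.0)
reader.get("doc-1")
reader.get("doc-1")   # second read is served from cache
print(len(calls))     # the single writer was hit only once
```

The trade-off is the usual one: cached reads may be up to `ttl_seconds` stale, which is acceptable for many read-heavy paths but not for the in-object read-after-write guarantee described earlier.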
Programming model, features, and limits
- DOs are described as an actor-like model with global routing and optional RPC-style calls between objects or workers.
- The synchronous-looking write API actually defers commit confirmation and error reporting until the DO's response is sent, which enables automatic write batching.
- Noted limits: 128MB RAM per runtime, no built-in read transactions/snapshots, long-lived cursors are tricky because open readers prevent WAL checkpointing and let it grow, and hibernating WebSockets add lifecycle complexity.
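The deferred-error pattern above can be modeled as a write buffer whose failures only surface when the response is produced. A hedged sketch in Python using sqlite3 — `BatchedWriter`, `exec`, and `respond` are hypothetical names illustrating the shape of such an API, not the Workers API itself:

```python
import sqlite3

class BatchedWriter:
    """Toy model of a sync-looking write API: exec() returns immediately,
    statements are batched, and errors surface only at respond() time."""
    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.pending: list[tuple[str, tuple]] = []

    def exec(self, sql: str, params: tuple = ()) -> None:
        self.pending.append((sql, params))  # no I/O yet; the caller keeps going

    def respond(self, body: str) -> str:
        # Flush the whole batch in one transaction. An error raised here aborts
        # the response, so callers never observe an acknowledged-but-lost write.
        with self.conn:
            for sql, params in self.pending:
                self.conn.execute(sql, params)
        self.pending.clear()
        return body

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (msg TEXT)")
writer = BatchedWriter(conn)
writer.exec("INSERT INTO events VALUES (?)", ("hello",))
print(writer.respond("ok"))  # writes commit here; errors would also raise here
```

The key property is that holding the response until the batch commits keeps the fast, synchronous-feeling API honest: the client only sees "ok" if the writes are actually durable.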
Cost, lock-in, and operational concerns
- Pricing and hibernation behavior make some developers nervous; lack of strong spending caps is seen as risky for small teams.
- Heavy vendor lock-in worries many; rebuilding elsewhere would be nontrivial, though some projects try to offer portable abstractions.
- Debuggability, observability, and handling slow DOs or failures at scale are flagged as open concerns.
Data modeling, analytics, and migrations
- Per-document/tenant SQLite is attractive for localized state, but makes global queries (e.g., “all full flights”, analytics across all docs) harder; likely requires a separate analytics system.
- Schema migrations across many DOs are nontrivial; suggested pattern is running per-DO migrations on initialization.
- One poster dislikes “many tiny DBs” from a relational perspective; others note it fits document-like domains better than giant global tables.
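The run-migrations-on-initialization pattern suggested above is commonly implemented with SQLite's `PRAGMA user_version` counter. A minimal sketch in Python, assuming a hypothetical `docs` schema; the same idea applies inside a DO's constructor:

```python
import sqlite3

# Ordered migration list; position N (1-based) is schema version N. Hypothetical schema.
MIGRATIONS = [
    "CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)",
    "ALTER TABLE docs ADD COLUMN updated_at INTEGER",
]

def migrate(conn: sqlite3.Connection) -> int:
    """Apply any pending migrations; safe and cheap to call on every initialization."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for i, stmt in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(stmt)
        conn.execute(f"PRAGMA user_version = {i}")  # record progress per step
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
print(migrate(conn))  # first init: applies both migrations, version becomes 2
print(migrate(conn))  # later inits: no-op, version stays 2
```

Because each DO migrates itself lazily on first access, a fleet of millions of objects converges on the new schema without any coordinated rollout, at the cost of every code version needing to tolerate objects that have not woken up yet.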
Comparisons to traditional databases
- Several argue: start with Postgres for most apps; only move to specialized systems (ClickHouse, DOs, etc.) once scale or latency demands justify complexity.
- Others view colocating compute + SQLite as a real complexity reduction for certain classes of apps, not just “shiny tech.”
Unclear / open questions
- How data residency and regulatory requirements are satisfied is raised but not answered.
- Low-level implementation details (e.g., exact VFS/WAL integration) remain unexplained in the thread.