Scaling PostgreSQL to power 800M ChatGPT users

PostgreSQL-at-scale architecture

  • Core setup: one PostgreSQL primary handling all writes plus ~50 read replicas; read-heavy traffic is offloaded, write-heavy shardable workloads are moved to Azure CosmosDB and other sharded systems.
  • New tables are no longer added to the main Postgres deployment; new features default to sharded systems.

Write scalability and MVCC limits

  • Discussion centers on MVCC causing write and read amplification, bloat, and autovacuum complexity under heavy write load.
  • Some point to LSM-tree systems (e.g., TiDB, RocksDB-style designs) as better suited for high write throughput; others report mixed performance experiences with these systems.
  • Several note that MySQL or SQL Server can outperform Postgres on certain write-heavy or query-planning workloads, but licensing and cost make them unattractive for startups.

Sharding vs single primary

  • Strong debate over whether sharding is “just a DB concern” or necessarily leaks into the application via joins, cross-shard transactions, and consistency semantics.
  • Comments highlight that cross-shard operations often become non-transactional or rely on 2PC with eventual consistency and operational complexity (schema changes, resharding, observability).
  • Some argue OpenAI effectively did shard—just by moving workloads to different databases instead of sharded Postgres itself.

Replication, hardware, and infra

  • Curiosity about replication details: likely async streaming replication; concerns about lagging replicas causing WAL retention and potential slowdowns.
  • Alternatives discussed: shipping WAL to object storage and having replicas pull from there, with higher baseline lag and dependence on object-store performance.
  • Thread dives into massive Azure/AWS VM SKUs (hundreds–thousands of cores, tens of TB RAM), their high cost, and advice to prefer multiple “medium” boxes over giant NUMA monsters.

Operational lessons

  • Emphasis on “boring” techniques: connection pooling (pgbouncer), query optimization, caching, schema-change timeouts.
  • Anecdote on idle transactions exhausting connection slots and using compile-time checks to prevent holding connections across async waits.
  • One theme: Postgres scales very far if used mainly as a transactional “source of truth” while offloading search/analytics/discovery elsewhere.

Reception of the article

  • Some praise it as a grounded example that a single primary plus replicas can support enormous scale and that many companies over-shard prematurely.
  • Others criticize it as vague, repetitive, Azure/CosmosDB marketing with little novel technical detail, and point out the resulting multi-database complexity and lock-in.