2026-01-22

Scaling PostgreSQL to power 800M ChatGPT users

PostgreSQL-at-scale architecture

Core setup: one PostgreSQL primary handling all writes plus ~50 read replicas; read-heavy traffic is offloaded, write-heavy shardable workloads are moved to Azure CosmosDB and other sharded systems.
New tables are no longer added to the main Postgres deployment; new features default to sharded systems.

Write scalability and MVCC limits

Discussion centers on MVCC causing write and read amplification, bloat, and autovacuum complexity under heavy write load.
Some point to LSM-tree systems (e.g., TiDB, RocksDB-style designs) as better suited for high write throughput; others report mixed performance experiences with these systems.
Several note that MySQL or SQL Server can outperform Postgres on certain write-heavy or query-planning workloads, but licensing and cost make them unattractive for startups.

Sharding vs single primary

Strong debate over whether sharding is “just a DB concern” or necessarily leaks into the application via joins, cross-shard transactions, and consistency semantics.
Comments highlight that cross-shard operations often become non-transactional or rely on 2PC with eventual consistency and operational complexity (schema changes, resharding, observability).
Some argue OpenAI effectively did shard—just by moving workloads to different databases instead of sharded Postgres itself.

Replication, hardware, and infra

Curiosity about replication details: likely async streaming replication; concerns about lagging replicas causing WAL retention and potential slowdowns.
Alternatives discussed: shipping WAL to object storage and having replicas pull from there, with higher baseline lag and dependence on object-store performance.
Thread dives into massive Azure/AWS VM SKUs (hundreds–thousands of cores, tens of TB RAM), their high cost, and advice to prefer multiple “medium” boxes over giant NUMA monsters.

Operational lessons

Emphasis on “boring” techniques: connection pooling (pgbouncer), query optimization, caching, schema-change timeouts.
Anecdote on idle transactions exhausting connection slots and using compile-time checks to prevent holding connections across async waits.
One theme: Postgres scales very far if used mainly as a transactional “source of truth” while offloading search/analytics/discovery elsewhere.

Reception of the article

Some praise it as a grounded example that a single primary plus replicas can support enormous scale and that many companies over-shard prematurely.
Others criticize it as vague, repetitive, Azure/CosmosDB marketing with little novel technical detail, and point out the resulting multi-database complexity and lock-in.

Related topics