Kafka is Fast – I'll use Postgres
Postgres as the Default Tool
- Many commenters strongly endorse “start with Postgres” for startups and small/medium systems: one database, simple ops, huge ecosystem, and good enough performance for thousands of users and millions of events/day.
- Several note that Rails 8 and other stacks are leaning into this: background jobs, caching, and WebSockets all backed by Postgres to reduce moving parts.
- Postgres-based queues (e.g., pgmq, PGQueuer, or custom SKIP LOCKED tables) are reported to work well up to ~5–10k msg/s and millions of jobs/day; see the sketch after this list.
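
A minimal sketch of the SKIP LOCKED pattern commenters describe, assuming psycopg2 and a hypothetical `jobs` table; pgmq and PGQueuer wrap the same idea with more batteries included.

```python
import psycopg2

# Hypothetical schema, for illustration only:
#   CREATE TABLE jobs (id bigserial PRIMARY KEY, payload jsonb, done boolean NOT NULL DEFAULT false);

conn = psycopg2.connect("dbname=app")  # assumed connection string


def claim_and_run_one(handle):
    """Claim one pending job; FOR UPDATE SKIP LOCKED lets concurrent workers
    skip rows another worker has already locked instead of blocking on them."""
    with conn, conn.cursor() as cur:  # `with conn` commits on success, rolls back on error
        cur.execute(
            """
            SELECT id, payload
            FROM jobs
            WHERE NOT done
            ORDER BY id
            FOR UPDATE SKIP LOCKED
            LIMIT 1
            """
        )
        row = cur.fetchone()
        if row is None:
            return False  # queue is empty
        job_id, payload = row
        handle(payload)  # caller-supplied worker function (assumed)
        # Mark done (or DELETE) in the same transaction: a crash before commit
        # releases the lock and the job becomes visible to other workers again.
        cur.execute("UPDATE jobs SET done = true WHERE id = %s", (job_id,))
        return True
```

Deleting rows instead of flagging them keeps the table small, but feeds directly into the VACUUM and bloat caveats below.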
Caveats to “Use Postgres for Everything”
- Commenters stress that you must understand Postgres’ locking, isolation levels, VACUUM, and write amplification; a naive “just shove it in a table” approach can become a bottleneck under heavy write load or contention.
- LISTEN/NOTIFY and polling don’t scale arbitrarily, and high-frequency, delete-heavy queues can lead to VACUUM and index bloat issues; see the sketch after this list.
- Using the same instance for OLTP data and queues can cause interference; some split into separate DBs/servers once load grows.
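
On the LISTEN/NOTIFY point, the usual pattern is “notify to wake, poll as a fallback” rather than carrying data in notifications. A sketch assuming psycopg2 and a hypothetical `new_job` channel; `drain_jobs` stands in for a worker loop and is not a real API.

```python
import select
import psycopg2

conn = psycopg2.connect("dbname=app")  # assumed connection string
conn.autocommit = True                 # notifications are delivered outside explicit transactions
cur = conn.cursor()
cur.execute("LISTEN new_job;")         # hypothetical channel; producers run NOTIFY new_job after INSERT

while True:
    # Block until the server pushes a notification or 5 seconds pass.
    if select.select([conn], [], [], 5) == ([], [], []):
        drain_jobs()                   # timeout: poll anyway so a missed notification can't strand work
    else:
        conn.poll()
        conn.notifies.clear()          # payloads are advisory; rows are still claimed via SKIP LOCKED
        drain_jobs()                   # assumed worker loop built on the SKIP LOCKED query above
```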
Kafka’s Strengths and Misuse
- Kafka is praised for:
  - Handling very high throughput (hundreds of thousands to millions of msgs/s reported on modest hardware).
  - Durable event logs with per-consumer offsets, consumer groups, and the ability to replay/rewind (see the consumer sketch after this list).
  - Enabling multi-team, event-driven architectures and parallel development.
- Critiques:
  - Operational and organizational overhead (clusters, tuning, client configs, rebalancing, vendor lock-in, cost of managed services).
  - Often introduced for “resume-driven design” or vague future scale rather than a current need.
  - Frequently misused as a work queue; the lack of native per-message NACK/DLQ semantics leads to tricky error handling.
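
To make the offsets/replay point concrete, a minimal consumer sketch, assuming the kafka-python client, a local broker, and a hypothetical `orders` topic; each consumer group keeps its own committed offsets, so one group rewinding or falling behind does not affect another.

```python
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    "orders",                          # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="billing-service",        # consumer group: the broker stores this group's offsets
    auto_offset_reset="earliest",      # a brand-new group starts from the beginning of the log
    enable_auto_commit=True,
)

for msg in consumer:
    process(msg.value)                 # assumed handler; messages are not deleted after reading
    # Replaying is just moving this group's cursor, e.g. back to offset 0 on one partition:
    # consumer.seek(TopicPartition("orders", msg.partition), 0)
```

The same log can feed a second group (say, analytics) independently, which is the multi-consumer property that a delete-on-read queue table does not give you.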
Queues vs Pub/Sub vs Event Logs
- Several distinguish between:
  - Work queues: one consumer handles each job, and the message is typically deleted once processed.
  - Pub/sub logs: durable, append-only streams where many consumers each track their own cursor.
- Implementing Kafka-like event logs in Postgres is possible but non-trivial:
  - You need monotonic sequence numbers that don’t leave gaps on aborted transactions (plain sequences do).
  - It requires careful transaction design (counter tables, triggers, or logical-time schemes) and client libraries to manage offsets; a rough sketch follows this list.
  - Tooling and client ergonomics are currently much weaker than in the Kafka ecosystem.
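
A rough sketch of one counter-table scheme for the sequencing problem, assuming psycopg2 and hypothetical `events`, `event_counter`, and `consumer_offsets` tables; because each appender updates a single counter row inside its own transaction, numbers are handed out in commit order and rollbacks leave no gaps, at the cost of serializing all appends on that row.

```python
import psycopg2
from psycopg2.extras import Json

# Hypothetical schema, for illustration only:
#   CREATE TABLE event_counter    (id int PRIMARY KEY CHECK (id = 1), last_seq bigint NOT NULL);
#   CREATE TABLE events           (seq bigint PRIMARY KEY, topic text, payload jsonb);
#   CREATE TABLE consumer_offsets (consumer text PRIMARY KEY, last_seq bigint NOT NULL);

conn = psycopg2.connect("dbname=app")  # assumed connection string


def append_event(topic, payload):
    """Append with a gapless, commit-ordered sequence number. A plain bigserial
    would be cheaper but leaves holes on rollbacks and can become visible out of
    order, which breaks consumers that track a simple high-water mark."""
    with conn, conn.cursor() as cur:
        cur.execute("UPDATE event_counter SET last_seq = last_seq + 1 WHERE id = 1 RETURNING last_seq")
        seq = cur.fetchone()[0]        # concurrent appenders queue on this row lock
        cur.execute(
            "INSERT INTO events (seq, topic, payload) VALUES (%s, %s, %s)",
            (seq, topic, Json(payload)),
        )


def consume(consumer, batch_size=100):
    """Kafka-style read: events are never deleted; each consumer advances its own cursor.
    Assumes a row for `consumer` was seeded in consumer_offsets with last_seq = 0."""
    with conn, conn.cursor() as cur:
        cur.execute("SELECT last_seq FROM consumer_offsets WHERE consumer = %s FOR UPDATE", (consumer,))
        last_seq = cur.fetchone()[0]
        cur.execute(
            "SELECT seq, topic, payload FROM events WHERE seq > %s ORDER BY seq LIMIT %s",
            (last_seq, batch_size),
        )
        rows = cur.fetchall()
        if rows:
            cur.execute(
                "UPDATE consumer_offsets SET last_seq = %s WHERE consumer = %s",
                (rows[-1][0], consumer),
            )
        return rows
```

Everything Kafka layers on top of this (partitions, consumer-group rebalancing, retention, client tooling) is what the “weaker ergonomics” point refers to.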
Broader Themes
- Ongoing tension between:
  - Chasing new tech for scale/career vs overfitting a favorite tool (Postgres, Kafka, etc.).
  - Performance purity vs total cost (complexity, ops, hiring, recovery, migration).
- Several argue the real skill is knowing when Postgres is “good enough for now” and when concrete scaling pain justifies Kafka or other specialized systems.