Kafka is Fast – I'll use Postgres
Postgres as the Default Tool
- Many commenters strongly endorse “start with Postgres” for startups and small/medium systems: one database, simple ops, huge ecosystem, and good enough performance for thousands of users and millions of events/day.
- Several note that Rails 8 and other stacks are leaning into this: background jobs, caching, and WebSockets all backed by Postgres to reduce moving parts.
- Postgres-based queues (e.g., pgmq, PGQueuer, or custom SKIP LOCKED tables) are reported to work well up to ~5–10k msg/s and millions of jobs/day; see the sketch after this list.
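
A minimal sketch of the SKIP LOCKED pattern commenters describe, assuming psycopg2 and a hypothetical `jobs` table; pgmq and PGQueuer wrap the same idea with more batteries included.

```python
import psycopg2

# Hypothetical schema, for illustration only:
#   CREATE TABLE jobs (id bigserial PRIMARY KEY, payload jsonb, done boolean NOT NULL DEFAULT false);

conn = psycopg2.connect("dbname=app")  # assumed connection string


def claim_and_run_one(handle):
    """Claim one pending job; FOR UPDATE SKIP LOCKED lets concurrent workers
    skip rows another worker has already locked instead of blocking on them."""
    with conn, conn.cursor() as cur:  # `with conn` commits on success, rolls back on error
        cur.execute(
            """
            SELECT id, payload
            FROM jobs
            WHERE NOT done
            ORDER BY id
            FOR UPDATE SKIP LOCKED
            LIMIT 1
            """
        )
        row = cur.fetchone()
        if row is None:
            return False  # queue is empty
        job_id, payload = row
        handle(payload)  # caller-supplied worker function (assumed)
        # Mark done (or DELETE) in the same transaction: a crash before commit
        # releases the lock and the job becomes visible to other workers again.
        cur.execute("UPDATE jobs SET done = true WHERE id = %s", (job_id,))
        return True
```

Deleting rows instead of flagging them keeps the table small, but feeds directly into the VACUUM and bloat caveats below.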
Caveats to “Use Postgres for Everything”
- Commenters stress that you must understand Postgres’ locking, isolation levels, VACUUM, and write amplification; a naive “just shove it in a table” approach can become a bottleneck under heavy write load or contention.
- LISTEN/NOTIFY and polling don’t scale arbitrarily, and high-frequency, delete-heavy queues can lead to VACUUM and index bloat issues; see the sketch after this list.
- Using the same instance for OLTP data and queues can cause interference; some split into separate DBs/servers once load grows.
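
On the LISTEN/NOTIFY point, the usual pattern is “notify to wake, poll as a fallback” rather than carrying data in notifications. A sketch assuming psycopg2 and a hypothetical `new_job` channel; `drain_jobs` stands in for a worker loop and is not a real API.

```python
import select
import psycopg2

conn = psycopg2.connect("dbname=app")  # assumed connection string
conn.autocommit = True                 # notifications are delivered outside explicit transactions
cur = conn.cursor()
cur.execute("LISTEN new_job;")         # hypothetical channel; producers run NOTIFY new_job after INSERT

while True:
    # Block until the server pushes a notification or 5 seconds pass.
    if select.select([conn], [], [], 5) == ([], [], []):
        drain_jobs()                   # timeout: poll anyway so a missed notification can't strand work
    else:
        conn.poll()
        conn.notifies.clear()          # payloads are advisory; rows are still claimed via SKIP LOCKED
        drain_jobs()                   # assumed worker loop built on the SKIP LOCKED query above
```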
Kafka’s Strengths and Misuse
- Kafka is praised for:
  - Handling very high throughput (hundreds of thousands to millions of msgs/s reported on modest hardware).
  - Durable event logs with per-consumer offsets, consumer groups, and the ability to replay/rewind (see the consumer sketch after this list).
  - Enabling multi-team, event-driven architectures and parallel development.
- Critiques:
  - Operational and organizational overhead (clusters, tuning, client configs, rebalancing, vendor lock-in, cost of managed services).
  - Often introduced for “resume-driven design” or vague future scale rather than a current need.
  - Frequently misused as a work queue; the lack of native per-message NACK/DLQ semantics leads to tricky error handling.
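
To make the offsets/replay point concrete, a minimal consumer sketch, assuming the kafka-python client, a local broker, and a hypothetical `orders` topic; each consumer group keeps its own committed offsets, so one group rewinding or falling behind does not affect another.

```python
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    "orders",                          # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="billing-service",        # consumer group: the broker stores this group's offsets
    auto_offset_reset="earliest",      # a brand-new group starts from the beginning of the log
    enable_auto_commit=True,
)

for msg in consumer:
    process(msg.value)                 # assumed handler; messages are not deleted after reading
    # Replaying is just moving this group's cursor, e.g. back to offset 0 on one partition:
    # consumer.seek(TopicPartition("orders", msg.partition), 0)
```

The same log can feed a second group (say, analytics) independently, which is the multi-consumer property that a delete-on-read queue table does not give you.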
Queues vs Pub/Sub vs Event Logs
- Several distinguish between:
  - Work queues: one consumer handles each job, and the message is typically deleted once processed.
  - Pub/sub logs: durable, append-only streams where many consumers each track their own cursor.
- Implementing Kafka-like event logs in Postgres is possible but non-trivial:
  - You need monotonic sequence numbers that don’t leave gaps on aborted transactions (plain sequences do).
  - It requires careful transaction design (counter tables, triggers, or logical-time schemes) and client libraries to manage offsets; a rough sketch follows this list.
  - Tooling and client ergonomics are currently much weaker than in the Kafka ecosystem.
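
A rough sketch of one counter-table scheme for the sequencing problem, assuming psycopg2 and hypothetical `events`, `event_counter`, and `consumer_offsets` tables; because each appender updates a single counter row inside its own transaction, numbers are handed out in commit order and rollbacks leave no gaps, at the cost of serializing all appends on that row.

```python
import psycopg2
from psycopg2.extras import Json

# Hypothetical schema, for illustration only:
#   CREATE TABLE event_counter    (id int PRIMARY KEY CHECK (id = 1), last_seq bigint NOT NULL);
#   CREATE TABLE events           (seq bigint PRIMARY KEY, topic text, payload jsonb);
#   CREATE TABLE consumer_offsets (consumer text PRIMARY KEY, last_seq bigint NOT NULL);

conn = psycopg2.connect("dbname=app")  # assumed connection string


def append_event(topic, payload):
    """Append with a gapless, commit-ordered sequence number. A plain bigserial
    would be cheaper but leaves holes on rollbacks and can become visible out of
    order, which breaks consumers that track a simple high-water mark."""
    with conn, conn.cursor() as cur:
        cur.execute("UPDATE event_counter SET last_seq = last_seq + 1 WHERE id = 1 RETURNING last_seq")
        seq = cur.fetchone()[0]        # concurrent appenders queue on this row lock
        cur.execute(
            "INSERT INTO events (seq, topic, payload) VALUES (%s, %s, %s)",
            (seq, topic, Json(payload)),
        )


def consume(consumer, batch_size=100):
    """Kafka-style read: events are never deleted; each consumer advances its own cursor.
    Assumes a row for `consumer` was seeded in consumer_offsets with last_seq = 0."""
    with conn, conn.cursor() as cur:
        cur.execute("SELECT last_seq FROM consumer_offsets WHERE consumer = %s FOR UPDATE", (consumer,))
        last_seq = cur.fetchone()[0]
        cur.execute(
            "SELECT seq, topic, payload FROM events WHERE seq > %s ORDER BY seq LIMIT %s",
            (last_seq, batch_size),
        )
        rows = cur.fetchall()
        if rows:
            cur.execute(
                "UPDATE consumer_offsets SET last_seq = %s WHERE consumer = %s",
                (rows[-1][0], consumer),
            )
        return rows
```

Everything Kafka layers on top of this (partitions, consumer-group rebalancing, retention, client tooling) is what the “weaker ergonomics” point refers to.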
Broader Themes
- Ongoing tension between:
  - Chasing new tech for scale/career vs overfitting a favorite tool (Postgres, Kafka, etc.).
  - Performance purity vs total cost (complexity, ops, hiring, recovery, migration).
- Several argue the real skill is knowing when Postgres is “good enough for now” and when concrete scaling pain justifies Kafka or other specialized systems.