Go ahead, self-host Postgres

When 24/7 Uptime Really Matters

  • Strong disagreement on how often “3 AM pages” are truly justified.
  • Some describe near-universal expectations of 24/7 availability (overnight batch jobs, banking/healthcare integrations, SLAs, reporting), even when humans aren’t working.
  • Others argue many important systems accept overnight or weekend downtime, have no pager rotation, and rely on manual fallbacks or VIP-only workarounds.
  • Uptime is also a sales/reputation lever: enterprises expect “always on” even if usage doesn’t strictly require it.
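
To make the SLA side of this debate concrete, here is a minimal sketch of the arithmetic behind an availability target. The function name and the 30-day window are illustrative assumptions, not something taken from the discussion.

```python
def downtime_budget_minutes(availability: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability target permits over a window.

    `availability` is a fraction (0.999 means "three nines").
    The 30-day default window is an assumption for illustration.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * window_days * 24 * 60


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999):
        budget = downtime_budget_minutes(target)
        print(f"{target:.2%} over 30 days -> {budget:.1f} min")
```

Under these assumptions, 99% allows roughly 7.2 hours of downtime per 30 days, while 99.9% allows about 43 minutes. That gap is one way to judge whether a system that tolerates overnight downtime really needs a pager rotation.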

Self-Hosting vs Managed Postgres

  • Many report years or decades of trouble-free self-hosting with simple setups: single server, automated backups, basic monitoring.
  • Others emphasize that production-grade setups (backups, PITR, replicas, failover, upgrades, tuning) are nontrivial and time-consuming, especially without in-house DB expertise.
  • Managed services (RDS, Cloud SQL, AlloyDB, Supabase, etc.) are praised for backups, upgrades, monitoring, and reduced operational toil, but criticized as expensive and opaque, with limited control during incidents.
  • Both sides agree: managed DBs do not eliminate the need for database skills, disaster recovery planning, or backup verification.

High Availability and Clustering

  • Postgres is widely seen as lacking “batteries-included” HA compared to MongoDB’s replica sets.
  • Common HA tooling includes Patroni, CloudNativePG, the Zalando operator, Autobase, and pg_auto_failover, but these add complexity and do not guarantee zero downtime in every failure mode.
  • Some argue most businesses don’t actually need true zero-downtime HA; fast recovery plus occasional brief outages is acceptable. Others find that for critical workloads, Postgres HA remains too hard without specialist DBAs.

Backups, Monitoring, and Reliability

  • Consensus that no backup strategy (including RDS) should be blindly trusted; test restores regularly.
  • Tools mentioned: pgBackRest, Barman, ZFS snapshots, and WAL archiving for backups; pgDash, Netdata, and pganalyze for monitoring.
  • A recurring failure mode: running out of disk space on managed or self-hosted nodes, leading to painful recovery.
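
The disk-space failure mode above is cheap to guard against. The following is a minimal sketch of such a check using only the Python standard library. The 80% and 90% thresholds and the function names are assumptions chosen for illustration, and a real setup would feed the result into whatever alerting is already in place.

```python
import shutil
import sys


def classify_usage(total_bytes: int, used_bytes: int,
                   warn: float = 0.80, crit: float = 0.90) -> str:
    """Return "ok", "warn", or "crit" for a used/total ratio.

    The thresholds are illustrative assumptions, not recommendations
    from the discussion.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    if fraction >= crit:
        return "crit"
    if fraction >= warn:
        return "warn"
    return "ok"


def check_path(path: str, warn: float = 0.80, crit: float = 0.90) -> str:
    """Classify the filesystem that holds `path`, e.g. the data directory."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return classify_usage(usage.total, usage.used, warn, crit)


if __name__ == "__main__":
    # Pass the Postgres data directory or WAL volume as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(target, check_path(target))
```

Run from cron or a systemd timer against the data directory and the WAL archive volume, a check like this gives early warning before writes start failing.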

Performance, Latency, and Cost

  • Self-hosted Postgres on bare metal or cheap VPS/Hetzner-style servers with NVMe is reported to vastly outperform cloud-managed offerings at a fraction of the price.
  • Network latency between app and DB can dominate query time; colocating them (same host or LAN) yields large speedups.
  • For small projects, some advocate SQLite + Litestream instead of any networked database.
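
The latency point can be illustrated with a back-of-the-envelope model: if a request issues its queries one after another, each query pays one network round trip plus its execution time. The sketch below assumes that simple sequential model. The query count, execution time, and round-trip figures are hypothetical values, not measurements from the discussion.

```python
def db_time_ms(n_queries: int, rtt_ms: float, exec_ms: float) -> float:
    """Total database wait for one request, in milliseconds.

    Assumes queries run sequentially and each costs one round trip
    plus its execution time (no pipelining or batching).
    """
    return n_queries * (rtt_ms + exec_ms)


if __name__ == "__main__":
    n_queries, exec_ms = 50, 0.2  # hypothetical request profile
    scenarios = (("same host", 0.05), ("same LAN", 0.5), ("cross-zone", 2.0))
    for label, rtt_ms in scenarios:  # hypothetical round-trip times
        total = db_time_ms(n_queries, rtt_ms, exec_ms)
        print(f"{label:>10}: {total:.1f} ms spent waiting on the database")
```

With these made-up numbers, execution accounts for only 10 ms per request in every scenario. The rest is round trips, which is why moving the application next to the database can matter more than faster hardware.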

People, Skills, and Responsibility

  • Management often prefers big-name cloud/SaaS for blame-shifting and reduced “bus factor” risk, even if cost is higher.
  • Others argue companies overpay for cloud while still needing infra engineers; black-box debugging of managed services can be as hard as self-hosting.
  • Several lament that basic sysadmin skills (Unix, RAID, backups) are now seen as exotic, and that fear of terminals helps drive adoption of expensive managed databases.