Go ahead, self-host Postgres
When 24/7 Uptime Really Matters
- Strong disagreement on how often “3 AM pages” are truly justified.
- Some describe near-universal expectations of 24/7 availability (overnight batch jobs, banking/healthcare integrations, SLAs, reporting), even when humans aren’t working.
- Others argue many important systems accept overnight or weekend downtime, have no pager rotation, and rely on manual fallbacks or VIP-only workarounds.
- Uptime is also a sales/reputation lever: enterprises expect “always on” even if usage doesn’t strictly require it.
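To make the disagreement above concrete, it helps to translate an availability target into a yearly downtime budget. The sketch below is plain arithmetic. The targets in the loop are illustrative assumptions, not figures from the discussion.

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year that a target allows."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Illustrative targets only; pick the ones your own SLA actually names.
for target in (99.0, 99.9, 99.99):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% availability allows about {minutes:.1f} min/year")
```

A 99.9% target leaves roughly 8.8 hours per year, which is why some participants consider a pager rotation unnecessary while others, held to stricter targets, do not.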
Self-Hosting vs Managed Postgres
- Many report years or decades of trouble-free self-hosting with simple setups: single server, automated backups, basic monitoring.
- Others emphasize that production-grade setups (backups, PITR, replicas, failover, upgrades, tuning) are nontrivial and time-consuming, especially without in-house DB expertise.
- Managed services (RDS, Cloud SQL, AlloyDB, Supabase, etc.) are praised for backups, upgrades, monitoring, and reduced operational toil, but criticized as expensive and opaque, with limited control during incidents.
- Both sides agree: managed DBs do not eliminate the need for database skills, disaster recovery planning, or backup verification.
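As a rough illustration of the "simple setup" end of this spectrum, an automated logical backup can be a small script run from cron or a systemd timer. The sketch below wraps `pg_dump` in its custom archive format. The database name, backup directory, and file naming scheme are hypothetical, and a logical dump like this does not provide PITR, which needs WAL archiving on top.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_pg_dump_command(dbname: str, out_path: Path) -> list[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", dbname]


def run_backup(dbname: str, backup_dir: Path) -> Path:
    """Dump one database to a timestamped file and return its path.

    Assumes pg_dump is on PATH and that connection settings come from the
    environment (PGHOST, PGUSER, ~/.pgpass, and so on).
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_path = backup_dir / f"{dbname}-{stamp}.dump"  # hypothetical naming
    # check=True raises CalledProcessError if pg_dump exits non-zero,
    # so a failed backup is not silently ignored.
    subprocess.run(build_pg_dump_command(dbname, out_path), check=True)
    return out_path


if __name__ == "__main__":
    # "appdb" and the target directory are placeholders for illustration.
    print(run_backup("appdb", Path("/var/backups/postgres")))
```

Whichever tool produces the backup, the point both sides agree on still applies: the file is only useful once a restore from it has been tested.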
High Availability and Clustering
- Postgres is widely seen as lacking “batteries-included” HA: it ships replication but no built-in automatic failover, unlike MongoDB’s replica sets.
- Common HA tooling includes Patroni, CloudNativePG, the Zalando operator, Autobase, and pg_auto_failover; these add complexity and do not guarantee zero downtime in every failure mode.
- Some argue most businesses don’t actually need true zero-downtime HA; fast recovery plus occasional brief outages is acceptable. Others find that for critical workloads, Postgres HA remains too hard without specialist DBAs.
Backups, Monitoring, and Reliability
- Consensus that no backup strategy (including RDS) should be blindly trusted; test restores regularly.
- Tools mentioned for backups: pgBackRest, Barman, ZFS snapshots, WAL archiving. For monitoring: pgDash, Netdata, pganalyze.
- A recurring failure mode: running out of disk space on managed or self-hosted nodes, leading to painful recovery.
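The disk-space failure mode above is cheap to guard against. Below is a minimal sketch of a check that could run from cron and feed whatever alerting is already in place. The 85% threshold and the checked path are assumptions to adjust, not recommendations from the discussion.

```python
import shutil


def disk_usage_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def is_disk_nearly_full(used_fraction: float, threshold: float = 0.85) -> bool:
    """Return True when usage has reached the alert threshold."""
    return used_fraction >= threshold


if __name__ == "__main__":
    # Placeholder path; point this at the volume holding the Postgres data
    # directory and, if separate, the WAL and backup volumes.
    data_path = "/"
    fraction = disk_usage_fraction(data_path)
    status = "ALERT" if is_disk_nearly_full(fraction) else "ok"
    print(f"{status}: {data_path} is {fraction:.0%} full")
```

Alerting well before 100% matters because Postgres needs free space to write WAL, so a completely full volume can make recovery itself harder.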
Performance, Latency, and Cost
- Self-hosted Postgres on bare metal or cheap VPS/Hetzner-style servers with NVMe is reported to vastly outperform cloud-managed offerings at a fraction of the price.
- Network latency between app and DB can dominate query time; colocating them (same host or LAN) yields large speedups.
- For small projects, some advocate SQLite + Litestream instead of any networked database.
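The latency point above is easiest to see with arithmetic: when a request issues its queries one after another, every query pays a full network round trip. The sketch below uses made-up numbers (50 queries, 0.2 ms of execution each, 1.0 ms versus 0.05 ms round trips) purely for illustration.

```python
def total_request_ms(
    num_queries: int, round_trip_ms: float, exec_ms_per_query: float
) -> float:
    """Estimate the time spent on sequential queries within one request.

    Each query waits for one network round trip plus its execution time.
    """
    return num_queries * (round_trip_ms + exec_ms_per_query)


# Hypothetical figures, not measurements from the discussion.
QUERIES = 50
EXEC_MS = 0.2

remote = total_request_ms(QUERIES, round_trip_ms=1.0, exec_ms_per_query=EXEC_MS)
local = total_request_ms(QUERIES, round_trip_ms=0.05, exec_ms_per_query=EXEC_MS)

print(f"remote DB: {remote:.1f} ms, colocated DB: {local:.1f} ms")
print(f"speedup from colocating: {remote / local:.1f}x")
```

Under these assumptions the queries themselves cost 10 ms in both cases, and the rest of the difference is network wait. Batching queries or reducing their number narrows the gap without moving the database.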
People, Skills, and Responsibility
- Management often prefers big-name cloud/SaaS for blame-shifting and reduced “bus factor” risk, even if the cost is higher.
- Others argue companies overpay for cloud while still needing infra engineers; black-box debugging of managed services can be as hard as self-hosting.
- Several lament that basic sysadmin skills (Unix, RAID, backups) are now seen as exotic, and that fear of terminals helps drive adoption of expensive managed databases.