Jepsen: Amazon RDS for PostgreSQL 17.4
Scope of the issue
- The tested system is Amazon RDS for PostgreSQL multi‑AZ clusters (the newer “cluster” flavor with two readable standbys), not:
- Single‑instance RDS,
- Classic multi‑AZ instance failover, or
- Plain upstream single‑node Postgres.
- The key finding: multi‑AZ clusters violate snapshot isolation and behave more like “parallel snapshot isolation,” including “long fork” / fractured‑read style anomalies.
- The anomalies occur on healthy systems, without fault injection.
Root cause and relation to upstream Postgres
- Several commenters explain a subtle upstream behavior:
- On the primary, visibility order is based on when the backend marks a transaction as committed.
- On replicas, visibility is based on WAL commit record order.
- These orders can diverge, so a replica can observe a transaction T2 while missing a transaction T1 that committed before T2 on the primary.
- This explains how a read‑only transaction on a replica can observe inconsistent snapshots even if the primary has proper snapshot isolation.
- There is ongoing upstream work to improve cross‑node snapshot consistency, but it’s unfinished and involves serious tradeoffs (e.g., read‑your‑writes vs durability/latency).
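The divergence described above can be illustrated with a toy model. This is pure Python, not actual Postgres internals: it only shows why two nodes exposing commits in different orders can each take an internally consistent snapshot that no single commit order explains.

```python
# Toy model (illustrative only, not Postgres internals).
# Two transactions T1 and T2 commit concurrently. Suppose their WAL
# commit records land in the order [T1, T2], but the primary's
# backends flip the "visible" flag in the order [T2, T1].

wal_order = ["T1", "T2"]               # order of commit records in the WAL
primary_visible_order = ["T2", "T1"]   # order backends mark commits visible

def snapshot(order, n):
    """Transactions visible after the first n commit events on a node."""
    return set(order[:n])

# A reader on the primary between the two visibility events sees only T2.
primary_snap = snapshot(primary_visible_order, 1)
# A reader on a replica that has replayed one WAL record sees only T1.
replica_snap = snapshot(wal_order, 1)

print(primary_snap)   # {'T2'}
print(replica_snap)   # {'T1'}
# No single commit order yields both snapshots: a "long fork".
assert primary_snap != replica_snap
```

Each node is self-consistent; the anomaly only appears when you compare reads taken on the primary and on a replica.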
Practical impact & example anomalies
- It’s not just “slightly stale reads”; you can see states that could never arise in any serial or single‑snapshot execution.
- Examples discussed:
- Chained background updates (GPS → postal code → city) observed out of logical order (city updated without postal code, etc.).
- “First commenter” / uniqueness checks granting the same badge to multiple users.
- Git‑like “read‑check‑write” flows ending in hashes that don’t correspond to any valid state.
- The risk is highest when applications:
- Assume snapshot isolation, and
- Use read replicas (multi‑AZ reader endpoint) in logic that conditions writes on prior reads.
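The "first commenter" badge case above can be sketched as a toy simulation (pure Python, with made-up state; not real application code): when the uniqueness check runs against a lagging replica snapshot while the write goes to the primary, both requests pass the check.

```python
# Hypothetical "first commenter" flow. The replica's snapshot lags the
# writer, so each request believes no comment exists yet.

writer_comments = []    # state on the writer (primary)
replica_comments = []   # stale snapshot on the reader endpoint
badges = []

def post_first_comment(user):
    # Read-check on the replica: sees no prior comments (stale).
    if len(replica_comments) == 0:
        writer_comments.append(user)   # write lands on the writer
        badges.append(user)            # badge conditioned on the stale read

post_first_comment("alice")
post_first_comment("bob")   # replica has not replayed alice's commit yet

print(badges)   # ['alice', 'bob'] -- the "unique" badge granted twice
```

Running the same check inside one transaction on the writer (or enforcing uniqueness with a constraint) avoids this class of bug regardless of replica lag.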
AWS guarantees, documentation, and tradeoffs
- Upstream Postgres documents snapshot isolation; commenters argue AWS does not clearly state that multi‑AZ clusters weaken this.
- Some see this as a bug or at least an undocumented deviation; others frame it as a deliberate performance/availability tradeoff in a distributed system with “no free lunch.”
- Several expect AWS either to:
- Fix the behavior (with potential latency/throughput costs), or
- Explicitly document the weaker guarantees and recommended usage (e.g., critical transactions against the writer only).
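One hedged sketch of the "critical transactions against the writer only" recommendation: route any read whose result will gate a write to the writer endpoint, and reserve the reader endpoint for stale-tolerant display queries. The endpoint hostnames below are invented for illustration.

```python
# Hypothetical endpoint-routing helper. Hostnames are made up; real
# RDS multi-AZ clusters expose separate writer and reader endpoints.

WRITER_ENDPOINT = "mycluster.cluster-xyz.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-xyz.us-east-1.rds.amazonaws.com"

def endpoint_for(read_only: bool, conditions_a_write: bool) -> str:
    """Send any read that feeds a later write to the writer, so the
    read and the write observe the same (primary) snapshot."""
    if conditions_a_write or not read_only:
        return WRITER_ENDPOINT
    return READER_ENDPOINT   # pure, stale-tolerant reads only

print(endpoint_for(read_only=True, conditions_a_write=True))    # writer
print(endpoint_for(read_only=True, conditions_a_write=False))   # reader
```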
RDS flavors and other systems
- Confusion is noted between:
- Multi‑AZ instances (classic synchronous replica for failover only), and
- Multi‑AZ clusters (two readable standbys with quorum‑like behavior).
- Some speculate that similar anomalies may appear in other Postgres replication setups, but this remains unclear; behavior is confirmed safe only for single‑node Postgres.
- Aurora is discussed: its shared‑storage architecture differs, so its behavior may be different, but it was not tested here.
Reaction to Jepsen and writing style
- Many praise the rigor and clarity of the Jepsen report and wish more vendor docs were equally precise.
- Others find the style dense/academic and initially inaccessible; multiple replies offer explanations, learning advice, and suggest using LLMs or tutorials to bridge the gap.
- One critical view claims the report lacks context and overstates failure; others counter that checking advertised guarantees against actual behavior is precisely the point.
Broader themes
- Thread reiterates that distributed system guarantees (including major cloud offerings) are often weaker or more complex than users assume.
- There is side discussion of other Jepsen targets (MongoDB, ZooKeeper, FoundationDB) and a desire for comprehensive Jepsen coverage of all RDS variants.
- Several commenters note that many developers, even seniors and architects, do not understand isolation levels, which makes these subtle consistency issues especially dangerous in real applications.