Jepsen: Amazon RDS for PostgreSQL 17.4

Scope of the issue

  • The tested system is Amazon RDS for PostgreSQL multi‑AZ clusters (the newer “cluster” flavor with two readable standbys), not:
    • Single‑instance RDS
    • Classic multi‑AZ instance failover, or
    • Plain upstream single‑node Postgres.
  • The key finding: multi‑AZ clusters violate snapshot isolation and behave more like “parallel snapshot isolation,” including “long fork” / fractured‑read style anomalies.
  • The anomalies occur on healthy systems, without fault injection.
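The "long fork" anomaly can be made concrete with a toy model (illustrative only, not RDS client code): two independent writes land on different keys, and two read-only snapshots each observe one write but not the other. Under snapshot isolation every snapshot is a prefix of one total commit order, so any two snapshots must be comparable; these two are not.

```python
# Toy illustration of a "long fork": T1 sets x, T2 sets y (independent
# transactions). Each reader's snapshot saw exactly one of the two writes.
reader_a = {"x": 1, "y": 0}   # snapshot that observed T1 but not T2
reader_b = {"x": 0, "y": 1}   # snapshot that observed T2 but not T1

def prefix_consistent(a, b):
    """Under snapshot isolation, every snapshot is a prefix of one total
    commit order, so any two snapshots must be comparable: one must include
    every committed write the other saw."""
    a_seen = {k for k, v in a.items() if v}
    b_seen = {k for k, v in b.items() if v}
    return a_seen <= b_seen or b_seen <= a_seen

print(prefix_consistent(reader_a, reader_b))  # False: a long fork
```

No single commit order can explain both snapshots, which is why this is a violation of snapshot isolation rather than mere staleness.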

Root cause and relation to upstream Postgres

  • Several commenters explain a subtle upstream behavior:
    • On the primary, visibility order is based on when the backend marks a transaction as committed.
    • On replicas, visibility is based on WAL commit record order.
    • These orders can diverge, so a replica can see transaction T but miss some transactions that logically happened before T.
  • This explains how a read‑only transaction on a replica can observe inconsistent snapshots even if the primary has proper snapshot isolation.
  • There is ongoing upstream work to improve cross‑node snapshot consistency, but it’s unfinished and involves serious tradeoffs (e.g., read‑your‑writes vs durability/latency).
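The divergence between the two visibility orders can be sketched in a few lines (a simplified model of the behavior described above, not Postgres internals): a transaction's commit record can reach the WAL before its backend flips the in-memory "visible" flag, so a concurrent transaction can become visible on the primary first while its WAL record comes second.

```python
# Simplified model: T1's commit record hits the WAL first, but T2's backend
# marks itself visible first on the primary (the race described above).
wal_order     = ["T1", "T2"]   # order commit records reached the WAL
primary_order = ["T2", "T1"]   # order transactions became visible on the primary

# A replica that has replayed only the first WAL record sees T1...
replica_snapshot = set(wal_order[:1])

# ...yet on the primary, T2 became visible *before* T1. The replica snapshot
# therefore shows T1 while missing a transaction that logically preceded it.
print(replica_snapshot)                                        # {'T1'}
print(primary_order.index("T2") < primary_order.index("T1"))   # True
```

This is how a replica read can observe an inconsistent snapshot even though each individual node orders its own commits sensibly.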

Practical impact & example anomalies

  • It’s not just “slightly stale reads”; you can see states that could never arise in any serial or single‑snapshot execution.
  • Examples discussed:
    • Chained background updates (GPS → postal code → city) observed out of logical order (city updated without postal code, etc.).
    • “First commenter” / uniqueness checks granting the same badge to multiple users.
    • Git‑like “read‑check‑write” flows ending in hashes that don’t correspond to any valid state.
  • The risk is highest when applications:
    • Assume snapshot isolation,
    • Use read replicas (multi‑AZ reader endpoint) in logic that conditions writes on prior reads.
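The "first commenter" badge hazard above can be sketched as a read‑check‑write race (names like `award_first_commenter` are illustrative, not any real API): each check runs against a replica view that has not observed the other user's write, so the uniqueness invariant silently breaks.

```python
badges_primary = set()   # authoritative state on the writer
badges_replica = set()   # lagging / forked view behind the reader endpoint

def award_first_commenter(user, read_view, write_store):
    # Read-check-write conditioned on a replica read: "nobody has it yet."
    if not read_view:
        write_store.add(user)

# Two users race; each check sees an empty replica view, so both pass the
# uniqueness check and both writes land on the primary.
award_first_commenter("alice", set(badges_replica), badges_primary)
award_first_commenter("bob",   set(badges_replica), badges_primary)

print(badges_primary)   # both users hold the "unique" badge
```

The fix is to perform the check inside the same transaction as the write, against the writer endpoint, or to enforce the invariant with a database constraint rather than an application-level read.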

AWS guarantees, documentation, and tradeoffs

  • Upstream Postgres documents snapshot isolation; commenters argue AWS does not clearly state that multi‑AZ clusters weaken this.
  • Some see this as a bug or at least an undocumented deviation; others frame it as a deliberate performance/availability tradeoff in a distributed system with “no free lunch.”
  • Several expect AWS either to:
    • Fix the behavior (with potential latency/throughput costs), or
    • Explicitly document the weaker guarantees and recommended usage (e.g., critical transactions against the writer only).
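That recommended usage could be expressed as a simple routing policy (a sketch; the endpoint strings are placeholders, not real hostnames): send any transaction whose writes depend on its reads, or that needs a consistent snapshot, to the writer, and reserve the reader endpoint for reads that tolerate anomalies.

```python
# Placeholder endpoints -- substitute your cluster's actual writer/reader DNS.
WRITER_ENDPOINT = "writer.example.internal"
READER_ENDPOINT = "reader.example.internal"

def pick_endpoint(reads_feed_writes: bool, needs_consistent_snapshot: bool) -> str:
    """Route to the writer whenever correctness depends on the snapshot."""
    if reads_feed_writes or needs_consistent_snapshot:
        return WRITER_ENDPOINT
    return READER_ENDPOINT

print(pick_endpoint(reads_feed_writes=True,  needs_consistent_snapshot=False))
print(pick_endpoint(reads_feed_writes=False, needs_consistent_snapshot=False))
```

This mirrors the commenters' suggestion: the reader endpoint is fine for dashboards and best-effort reads, but not for logic that conditions writes on what it just read.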

RDS flavors and other systems

  • Confusion is noted between:
    • Multi‑AZ instances (classic synchronous replica for failover only), and
    • Multi‑AZ clusters (two readable standbys with quorum‑like behavior).
  • Some speculate that similar anomalies may appear in other Postgres replication setups, but this remains unclear; behavior is confirmed safe only for single‑node Postgres.
  • Aurora is discussed: its shared‑storage architecture differs, so its behavior may be different, but it was not tested here.

Reaction to Jepsen and writing style

  • Many praise the rigor and clarity of the Jepsen report and wish more vendor docs were equally precise.
  • Others find the style dense/academic and initially inaccessible; multiple replies offer explanations, learning advice, and suggest using LLMs or tutorials to bridge the gap.
  • One critical view argues the report lacks context and overstates the severity of the failures; others counter that checking advertised guarantees against actual behavior is precisely the point of Jepsen.

Broader themes

  • Thread reiterates that distributed system guarantees (including major cloud offerings) are often weaker or more complex than users assume.
  • There is side discussion of other Jepsen targets (MongoDB, ZooKeeper, FoundationDB) and a desire for comprehensive Jepsen coverage of all RDS variants.
  • Several commenters note that many developers, even seniors and architects, do not understand isolation levels, which makes these subtle consistency issues especially dangerous in real applications.