No such thing as exactly-once delivery

Core distinction: “delivery” vs “processing”

  • Major thread theme: people conflate “message delivery” with “message processing / committing side effects.”
  • One camp insists “exactly-once delivery” is impossible in failure-prone distributed systems.
  • Another says you can get “exactly-once processing” via idempotency, deduplication, counters, and transactions, as long as you acknowledge this is different from transport-level delivery.
  • Side effects (emails, database writes, external APIs) are where guarantees usually break down.
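
The "exactly-once processing over at-least-once delivery" position can be sketched roughly as follows. This is a minimal illustration, not any particular broker's API: `Message`, `DedupConsumer`, and the `msg_id` field are hypothetical names, and the deduplication set is kept in memory only.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str  # hypothetical unique ID assigned by the producer
    body: str


@dataclass
class DedupConsumer:
    """Applies each message's side effect at most once per msg_id."""
    seen: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, msg: Message) -> bool:
        if msg.msg_id in self.seen:
            return False  # duplicate delivery: skip the side effect
        self.applied.append(msg.body)  # stand-in for the real side effect
        self.seen.add(msg.msg_id)
        return True


consumer = DedupConsumer()
# At-least-once delivery: "m1" arrives twice, e.g. a retry after a lost ack.
deliveries = [
    Message("m1", "charge $5"),
    Message("m2", "send receipt"),
    Message("m1", "charge $5"),
]
results = [consumer.handle(m) for m in deliveries]
```

The transport still delivered "m1" twice; only the effect was applied once. A crash between applying the effect and recording the ID would reintroduce a duplicate, which is why the thread keeps returning to transactions.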

Limits, failures, and probabilities

  • Several comments stress that even “at-least-once” cannot be guaranteed in finite time when nodes, networks, or power can fail arbitrarily or partitions persist.
  • Systems can only drive the probability of loss/duplication arbitrarily low, not to zero.
  • Commenters cite the Byzantine Generals problem and the CAP theorem to argue that global, time-bounded exactly-once delivery is provably impossible under realistic failure assumptions.
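
A back-of-the-envelope sketch of the "arbitrarily low, not zero" point, assuming each delivery attempt fails independently with the same probability. That independence assumption is optimistic, since a persistent partition makes failures correlated. The function name and the numbers are illustrative only.

```python
def loss_probability(p_fail: float, attempts: int) -> float:
    """Probability that every one of `attempts` independent tries fails."""
    return p_fail ** attempts


# More retries shrink the chance of total loss but never reach zero,
# and every extra retry is another chance to create a duplicate.
for n in (1, 3, 10):
    print(n, loss_probability(0.1, n))
```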

Examples: TCP, queues, email, HFT

  • TCP is described as:
    • Within one connection: sequence numbers ensure the receiving application never sees the same bytes twice.
    • From the app’s perspective: at-most-once, because data can be lost on failures.
  • Streaming systems (Kafka, Kinesis, Flink, Beam, Kafka Streams) use offsets/checkpoints to approximate exactly-once processing over at-least-once delivery.
  • Email’s Message-ID header is cited as an idempotency key for deduplication.
  • High-frequency trading example: strict latency budgets make even at-least-once impossible to guarantee.
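
The offset/checkpoint idea can be sketched like this: store the consumer's position in the same transaction as the result, so a redelivered record is recognized and skipped. This is a simplified illustration using SQLite, not how any of the named systems implement it. The table names, the `process` function, and the assumption of in-order offsets are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (k TEXT PRIMARY KEY, v INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE progress ("
    "id INTEGER PRIMARY KEY CHECK (id = 1), next_offset INTEGER NOT NULL)"
)
conn.execute("INSERT INTO totals VALUES ('sum', 0)")
conn.execute("INSERT INTO progress VALUES (1, 0)")
conn.commit()


def process(offset: int, value: int) -> bool:
    """Apply a record and advance the checkpoint in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        (next_offset,) = conn.execute(
            "SELECT next_offset FROM progress WHERE id = 1"
        ).fetchone()
        if offset < next_offset:
            return False  # already processed: a redelivery after a restart
        conn.execute("UPDATE totals SET v = v + ? WHERE k = 'sum'", (value,))
        conn.execute(
            "UPDATE progress SET next_offset = ? WHERE id = 1", (offset + 1,)
        )
        return True


# Offset 1 is delivered twice (at-least-once delivery).
log = [(0, 10), (1, 20), (1, 20), (2, 5)]
outcomes = [process(off, val) for off, val in log]
total = conn.execute("SELECT v FROM totals WHERE k = 'sum'").fetchone()[0]
```

This only holds because the state and the checkpoint live in the same database. An email or an external API call made from inside `process` would sit outside that transaction and could still happen twice.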

Idempotency, transactions, and system boundaries

  • Repeated point: you can build reliable, transactional behavior on unreliable components, but you pay with complexity and cross-layer logic.
  • Exactly-once processing is achievable inside a transactional boundary; crossing boundaries requires idempotency keys and careful coordination.
  • Chaining two “exactly-once” subsystems via a stateless middle still requires end-to-end idempotency.
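
One way to picture an idempotency key crossing a system boundary: the caller generates a key once per logical operation and reuses it on every retry, and the downstream side stores the result under that key. `PaymentService` and `call_with_retries` are hypothetical, and the lost reply is simulated rather than caused by a real network.

```python
import uuid


class PaymentService:
    """Hypothetical downstream service that remembers results by key."""

    def __init__(self):
        self.results = {}  # idempotency key -> stored response
        self.charges = []  # side effects actually performed

    def charge(self, key: str, amount: int) -> str:
        if key in self.results:
            return self.results[key]  # replay stored response, no new charge
        self.charges.append(amount)
        self.results[key] = f"charged {amount}"
        return self.results[key]


def call_with_retries(service, key, amount, drop_first_reply=True):
    """Retry with the SAME key so an ambiguous failure cannot double-charge."""
    for attempt in range(3):
        reply = service.charge(key, amount)
        if drop_first_reply and attempt == 0:
            continue  # simulate: the request succeeded but the reply was lost
        return reply
    raise RuntimeError("gave up after 3 attempts")


service = PaymentService()
key = str(uuid.uuid4())  # generated once per logical operation, not per attempt
reply = call_with_retries(service, key, 500)
```

The caller cannot tell a lost request from a lost reply, so it has to retry. The key is what lets the far side turn that retry into a no-op, and a stateless middle layer has to pass it through unchanged.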

Filesystem and low-level guarantees

  • Debate over whether file renames across directories are truly atomic and durable across crashes.
  • Distinction between POSIX-level atomicity from a process’s view and on-disk reality under crashes or in distributed filesystems.
  • Conclusion: even with “atomic” primitives, crash timing can still reintroduce duplicates or ambiguity.
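
The write-then-rename pattern under discussion looks roughly like this on a POSIX local filesystem. `atomic_write` is a hypothetical helper. The sketch keeps the temporary file in the target's own directory, so it does not address the cross-directory case debated above, and the directory `fsync` is not expected to work on Windows. Network and distributed filesystems may not honor these guarantees.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push file contents to stable storage
        os.replace(tmp_path, path)  # atomic swap from a process's point of view
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Persist the directory entry too; otherwise the rename may be lost in a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even with both `fsync` calls, a crash right after `os.replace` leaves the caller unsure whether the write took effect. Retrying is safe here only because rewriting the same content is idempotent.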

Semantics, marketing, and practice

  • Several comments criticize vendors who advertise “exactly-once delivery,” arguing it’s really “exactly-once for practical purposes” or “inside our processing model.”
  • Some argue that if a higher layer only ever sees each message once, that’s effectively exactly-once; others insist terminology must reflect theoretical limits.
  • Anecdotes show real systems often have much higher duplicate rates than expected, and many apps assume exactly-once without monitoring or checks.