2026-05-08

Ask HN: We just had an actual UUID v4 collision...

Collision probability & randomness

Commenters stress that UUIDv4 collisions are astronomically unlikely but not impossible; odds like “1 in ~10²⁸” are cited.
Many argue a bug or bad RNG is vastly more likely than a true random collision at ~15k records.
Several point out confusion between "improbable" and "impossible", and correct gambler’s-fallacy reasoning: having seen one collision does not change the odds of the next one, or of winning the lottery, because the events are independent.
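The cited odds can be sanity-checked with the birthday approximation. The sketch below assumes 122 random bits per UUIDv4 (128 bits minus the fixed version and variant bits) and a perfectly uniform generator; the function name is illustrative, not something from the thread.

```python
import math

RANDOM_BITS_V4 = 122  # 128 bits minus 4 version bits and 2 variant bits


def collision_probability(n_ids: int, random_bits: int = RANDOM_BITS_V4) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    pairs = n_ids * (n_ids - 1) / 2
    space = 2 ** random_bits
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-pairs / space)


p = collision_probability(15_000)
print(f"p = {p:.3e}, roughly 1 in {1 / p:.1e}")
```

For 15,000 IDs the result is on the order of 1 in 10²⁸, which matches the figure quoted by commenters and is why a bug or bad RNG is the more plausible explanation.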

Suspected root causes

Strong consensus that the cause is a broken or weak entropy source:
- Poorly seeded PRNGs on cheap/embedded devices or VMs.
- Browser shims/polyfills for node:crypto, especially on mobile or constrained environments.
- Deterministic “random” in bots (e.g., Googlebot) leading to duplicate UUIDs.
- Virtualization “virtualizing entropy away”, or VM snapshots that capture RNG state and replay it on restore.
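The seeding and snapshot failure modes listed above can be reproduced in a few lines. This is a minimal sketch that deliberately uses Python's non-cryptographic `random.Random` as a stand-in for a weak generator; `uuid4_from` is a hypothetical helper, and real incidents involve OS or hardware entropy rather than this class.

```python
import random
import uuid


def uuid4_from(rng: random.Random) -> uuid.UUID:
    """Build a version-4 UUID from a PRNG (illustration only, not for production)."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Two "devices" that boot with the same weak seed produce the same IDs.
device_a = random.Random(1234)
device_b = random.Random(1234)
same_seed_collides = uuid4_from(device_a) == uuid4_from(device_b)

# A "snapshot" of generator state, restored later, replays the same IDs.
vm = random.Random()
snapshot = vm.getstate()
before = [uuid4_from(vm) for _ in range(3)]
vm.setstate(snapshot)
after = [uuid4_from(vm) for _ in range(3)]

print(same_seed_collides, before == after)
```

In both cases every ID is a well-formed UUIDv4, so nothing in the value itself reveals the problem.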
A specific uuid npm package change is flagged: rng() reuses a single Uint8Array instead of returning a copy, which is a foot-gun if a caller keeps the returned buffer, since the next call overwrites it.
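The shared-buffer hazard is easy to see in miniature. The package in question is JavaScript and uses a Uint8Array; the sketch below is only a Python analogue using `bytearray`, with hypothetical names, and is not the package's actual code.

```python
import os

_POOL = bytearray(16)  # one buffer shared by every call (the risky pattern)


def rng_shared() -> bytearray:
    """Refill and return the SAME buffer each time; callers must copy it."""
    _POOL[:] = os.urandom(16)
    return _POOL


# Misuse: keep the returned object. All three entries alias one buffer.
kept = [rng_shared() for _ in range(3)]

# Correct use: copy the bytes before the next call overwrites them.
safe = [bytes(rng_shared()) for _ in range(3)]

print(len({bytes(b) for b in kept}))  # every entry shows the latest bytes
print(len(set(safe)))                 # distinct values, barring a real collision
```

The generator itself is fine here; the duplicates come entirely from aliasing, which is why this kind of bug can look like an RNG failure.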

Client vs server UUID generation

Many criticize generating UUIDs on client devices or letting users provide IDs, citing manipulation and weak RNG.
Others argue client-side UUIDs are fine if validated and backed by uniqueness constraints.
There are concrete anecdotes of analytics systems based on browser-generated UUIDs suffering widespread collisions.
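For the "fine if validated" position, a server-side check might look like the following sketch. `parse_client_uuid4` is a hypothetical helper; it only verifies the shape of the ID (canonical form, RFC variant, version 4) and cannot tell whether the client's RNG was any good, so a uniqueness constraint is still needed behind it.

```python
import uuid


def parse_client_uuid4(text: str) -> uuid.UUID:
    """Accept only canonical, RFC-variant, version-4 UUID strings."""
    try:
        parsed = uuid.UUID(text)
    except ValueError as exc:
        raise ValueError("not a UUID") from exc
    # uuid.UUID() also accepts braces, URN prefixes, and missing hyphens.
    if str(parsed) != text.lower():
        raise ValueError("not in canonical 8-4-4-4-12 form")
    if parsed.variant != uuid.RFC_4122 or parsed.version != 4:
        raise ValueError("not a version-4 UUID")
    return parsed


def is_acceptable(text: str) -> bool:
    """Convenience wrapper that reports validity without raising."""
    try:
        parse_client_uuid4(text)
        return True
    except ValueError:
        return False
```

A value such as the all-zero UUID or a version-1 UUID is rejected, while any syntactically valid v4 value passes, including one a malicious client picked on purpose.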

Alternatives & UUID versions

Timestamp-based or structured IDs are discussed:
- UUIDv1/v7, ULID, Snowflake-like schemes, database sequences, and AES-encrypted counters.
- Pros: sortability, reduced dependence on entropy, easier collision reasoning.
- Cons: time leakage (privacy/side-channel), clock drift, fewer random bits per ID.
Some claim v7 would make a “collision like this” impossible; others counter it still has non-zero collision probability, especially with high volume per millisecond.
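To make the v7 trade-off concrete, here is a minimal sketch of the UUIDv7 bit layout as described in RFC 9562: a 48-bit Unix millisecond timestamp, 4 version bits, 12 random bits, 2 variant bits, and 62 random bits. `uuid7_sketch` is a hypothetical helper written for illustration; it omits the optional monotonic-counter methods the RFC allows.

```python
import secrets
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a UUIDv7-shaped value: timestamp prefix plus 74 random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand_a = secrets.randbits(12)
    rand_b = secrets.randbits(62)
    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                          # bits 79..76: version 7
    value |= rand_a << 64                       # bits 75..64: random
    value |= 0b10 << 62                         # bits 63..62: RFC variant
    value |= rand_b                             # bits 61..0: random
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier < later)  # IDs from later milliseconds sort after earlier ones
```

Two IDs minted in the same millisecond differ only in their 74 random bits, so the collision probability is small but not zero, and a broken entropy source would still produce duplicates within a millisecond.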

Handling collisions in practice

Several recommend always planning for collisions: unique indexes in the DB, retry-on-conflict loops, or generator-side checks against a cache.
Others note that the original appeal of UUIDv4 was precisely to avoid centralized checks, and that checking at scale can be costly.
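The unique-index-plus-retry approach can be sketched with SQLite from the standard library. The table name, column names, and `insert_with_retry` are assumptions made for this example; a production system would use its own schema and database driver.

```python
import sqlite3
import uuid
from typing import Callable


def open_db() -> sqlite3.Connection:
    """In-memory database whose primary key enforces ID uniqueness."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")
    return conn


def insert_with_retry(
    conn: sqlite3.Connection,
    payload: str,
    new_id: Callable[[], uuid.UUID] = uuid.uuid4,
    max_attempts: int = 3,
) -> str:
    """Insert a row, drawing a fresh ID whenever the database reports a duplicate."""
    for _ in range(max_attempts):
        candidate = str(new_id())
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO records (id, payload) VALUES (?, ?)",
                    (candidate, payload),
                )
            return candidate
        except sqlite3.IntegrityError:
            continue  # duplicate key: try again with a new ID
    raise RuntimeError(f"no unique ID after {max_attempts} attempts; check the RNG")
```

Capping the retries is a deliberate choice: a healthy generator should essentially never need a second attempt, so exhausting the loop is a strong signal that the entropy source is broken rather than unlucky.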

Entropy quality & high-reliability views

High-reliability systems often avoid pure entropy-based IDs because detecting RNG failure is hard.
Entropy sources (hardware noise, radiation, “lava lamp walls”) are discussed; more entropy is seen as good, but hard to verify in production.
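The difficulty of detecting RNG failure shows up even in a simple self-test. The sketch below is a hypothetical startup check, not a method from the thread: it catches a generator that is stuck on one value, yet it passes a fixed-seed generator whose entire output is predictable and would repeat on every machine that shares the seed.

```python
import random
import uuid
from typing import Callable


def gross_failure_check(
    new_id: Callable[[], uuid.UUID] = uuid.uuid4, sample_size: int = 1000
) -> bool:
    """Return True if a fresh sample of IDs contains no duplicates."""
    sample = [new_id() for _ in range(sample_size)]
    return len(set(sample)) == sample_size


# A generator stuck on one value is caught.
stuck_ok = gross_failure_check(lambda: uuid.UUID(int=0))

# A fixed-seed PRNG passes, even though its output is fully predictable.
weak = random.Random(42)
seeded_ok = gross_failure_check(
    lambda: uuid.UUID(int=weak.getrandbits(128), version=4)
)

print(stuck_ok, seeded_ok)
```

Passing such a check says nothing about entropy quality, which is consistent with the view that more entropy is good but hard to verify in production.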

Cultural / architectural critiques

Multiple anecdotes mock over-engineered UUID microservices and KPI-driven team growth, using this incident to highlight misplaced complexity and risk-blind design.
