Exploring PostgreSQL 18's new UUIDv7 support
UUIDs vs serial/bigserial
- Many argue
serial/bigserialis still ideal for a single Postgres instance: compact, fast indexes, simple semantics. - Reasons to prefer UUIDs: avoid leaking record counts/rates, generate IDs outside the DB, and make cross-system merging easier.
- A common pattern: bigint primary key internally, UUID (often v4) as external/opaque ID.
Client-side & distributed ID generation
- UUIDs support offline-first and multi-service architectures: clients or multiple backend nodes can generate IDs without round-trips or coordination.
- This enables idempotent writes (e.g., retries after network failures) and easier multi-database or multi-product correlation.
Security, enumeration, and “opaque” IDs
- Sequential IDs exposed in URLs/APIs make enumeration trivial and leak growth/usage patterns.
- Some treat unguessable IDs as an important extra security layer; others call this “security through obscurity” but still practically useful.
- Several insist correct authorization checks are the real fix; predictable IDs are only dangerous when auth is flawed.
UUIDv7 performance and locality
- Main value of v7 vs v4: monotonic ordering improves B‑tree locality and insert performance vs fully random v4, especially as tables/joins grow large.
- This matters more for write-heavy or huge tables; many report no issues with v4 even at very large scale, others have seen serious index bloat and cache misses.
- Monotonic IDs can be bad for some distributed databases (hot partitions) even as they help a single Postgres node.
Privacy & creation-time leakage
- v7 embeds a timestamp, so IDs reveal creation time. Debate centers on whether that’s a meaningful risk.
- Critics cite deanonymization, growth-rate inference, regulatory contexts (healthcare, HR), and links in insecure channels (email/SMS) as concerns.
- Others say most APIs already expose
created_at, and worrying about v7 here is overkill compared to more common threats (e.g., phishing).
Design patterns & mitigations
- Suggested patterns:
- v7 as internal PK + v4 as external ID (but this adds a second random index, partly eroding performance gains).
- Encrypt or format-preserving-encrypt v7 before exposing it, mapping opaque IDs to internal v7 without extra tables.
- Keep v4 everywhere for simplicity and uniformity unless performance clearly demands v7.
Alternatives & misc
- ULID, nanoid, TypeID, base32/base58 encodings discussed mainly for URL-friendliness and readability.
- Some note that any scheme balancing locality and unpredictability faces an inherent tradeoff; hashed/tweaked variants are floated but not standardized.
- Collision risk for client-generated v7 is widely considered negligible given the random bit budget.