The principles of database design, or, the Truth is out there

Normalization, performance, and usability

  • Many commenters argue the article over-emphasizes full normalization (3NF/5NF/6NF).
  • Common practice described: aim for ~3NF (sometimes BCNF) and selectively denormalize for performance, especially for reporting and analytics.
  • Some find highly normalized schemas painful for ad‑hoc queries (too many joins, especially in meetings or exploratory analysis).
  • Others counter that joins are intrinsic to the relational model; if you dislike joins, you may want a different data store or a dedicated reporting schema/views.
  • Skeptics of “Principle of Full Normalization” see it as ideological purity that ignores trade‑offs, hardware realities, and domain ambiguity.

Natural vs surrogate keys (PED principle)

  • The proposed “Principle of Essential Denotation” (use natural keys as identifiers) is heavily contested.
  • Multiple commenters report long, painful experiences where natural keys (e.g., national IDs, ISBNs, job codes, user-defined names) turned out to be:
    • Non-unique, reused, or duplicated.
    • Mutable due to policy changes, errors, or real‑world events (adoption, fraud, schema changes).
  • Many strongly prefer surrogate keys (typically integers) as primary keys, with natural keys enforced via unique constraints where appropriate.
  • Others note that natural keys are still important conceptually; danger is when they are ignored entirely and no constraints are enforced.

Security, exposure of identifiers, and UUIDs

  • Debate over having separate internal IDs (e.g., bigint) and external IDs (UUIDv4/v7, encrypted IDs).
  • One side: don’t expose internal sequential IDs; they reveal scale and enable inference attacks (German tank problem–style reasoning). Random or encrypted external IDs are seen as a useful extra layer.
  • Opposing view: relying on obscurity is fragile; proper authorization should make exposed IDs harmless, and adding multiple IDs increases complexity and query cost.
  • Some suggest encrypting internal IDs into 128‑bit opaque tokens, UUID-like but not standard, as a cheap way to avoid leaking internal state.

Messy reality and “natural” keys

  • Many anecdotes show supposedly “natural” identifiers are messy: duplicate or invalid ISBNs, conflicting government job codes, non-unique SSNs/national IDs, twins with near-identical attributes, people lacking documents, changing ID formats.
  • Takeaway: any externally assigned ID is risky as a primary key; treat such values as domain data, not core identity.

ORMs, theory vs practice, and domain complexity

  • Some principles and examples are said to map poorly to mainstream ORMs; developers relying solely on ORMs may miss important SQL features and optimizations.
  • Commenters highlight domains where:
    • Identity is probabilistic (entity resolution), so no clear natural key exists.
    • Multiple representations are inherently required (e.g., spatial vs graph structures in mapping).
  • Several criticize the article’s philosophical framing (“truth”, “representation of reality”) as overly theoretical; they emphasize pragmatic schemas that are efficient, evolvable, and reflect messy business rules rather than an idealized model.