SQL nulls are weird

What NULL Represents

  • Strong debate over whether SQL NULL means “unknown value,” “no value,” “missing data,” or an overloaded mix of all three.
  • Some argue the original relational notion is “unknown,” and that calling it NULL was an unfortunate naming choice; others insist “no value” or “absence” is a more practical mental model.
  • Several point out that conflating “unknown” and “known absent” is what makes NULL so confusing in real systems.

Three‑Valued Logic and Query Semantics

  • Many comments explain SQL’s three-valued logic (TRUE/FALSE/UNKNOWN) and show how comparisons with NULL yield UNKNOWN, which WHERE treats as “not selected.”
  • This leads to surprising behavior: = NULL / <> NULL never match; x = value silently excludes NULL rows unless explicitly guarded.
  • Some see this as mathematically consistent (e.g., Kleene logic); others call it ergonomically “clownish,” especially since X = X can be UNKNOWN.

Uniqueness, DISTINCT, GROUP BY

  • Key confusion: UNIQUE constraints treat multiple NULLs as distinct, while SELECT DISTINCT / GROUP BY group all NULLs together.
  • Some justify this as “set/group semantics”: all unknowns form one group, but are not equal as values.
  • Others find this inconsistent and argue it forces awkward patterns and mental overhead.

Database-Specific Behavior

  • Postgres 15+ allows NULLS NOT DISTINCT in unique indexes, giving control over NULL uniqueness.
  • IS [NOT] DISTINCT FROM (or dialect equivalents) is highlighted as the “correct” equality operator when NULLs must compare as equal.
  • Oracle’s empty‑string‑equals‑NULL behavior is widely criticized as especially bizarre.
  • Different engines (Postgres, MySQL/MariaDB, SQL Server, SQLite, Oracle) diverge on indexing, uniqueness, and NULL treatment, adding portability pain.

Schema Design and Modeling Choices

  • Academic relational purists argue NULL should rarely exist; optional data should be modeled as separate tables (1:0..1 relations).
  • Practitioners push back: this level of normalization is often impractical and verbose, so nullable columns are the de facto solution.
  • Soft deletes using deleted_at IS NULL are debated: some call it overloading and a design smell; others say it’s fine with proper indexes/views.

Alternatives, Workarounds, and Ergonomics

  • Suggested mitigations: COALESCE/IFNULL, IS [NOT] DISTINCT FROM, partial/filtered indexes, views for “active” rows.
  • Some wish for richer type systems (sum types, Option/Maybe, multiple null kinds) or even SQL replacements (e.g., systems that avoid NULL entirely).
  • Several note that NULL’s semantics are logically defensible but ergonomically hostile, especially for developers coming from “simple” language nulls and ORMs.