SQL nulls are weird
What NULL Represents
- Strong debate over whether SQL NULL means “unknown value,” “no value,” “missing data,” or an overloaded mix of all three.
- Some argue the original relational notion is “unknown,” and that calling it NULL was an unfortunate naming choice; others insist “no value” or “absence” is a more practical mental model.
- Several point out that conflating “unknown” and “known absent” is what makes NULL so confusing in real systems.
Three‑Valued Logic and Query Semantics
- Many comments explain SQL’s three-valued logic (TRUE/FALSE/UNKNOWN) and show how comparisons with NULL yield UNKNOWN, which WHERE treats as “not selected.”
- This leads to surprising behavior:
= NULL/<> NULLnever match;x = valuesilently excludes NULL rows unless explicitly guarded. - Some see this as mathematically consistent (e.g., Kleene logic); others call it ergonomically “clownish,” especially since
X = Xcan be UNKNOWN.
Uniqueness, DISTINCT, GROUP BY
- Key confusion: UNIQUE constraints treat multiple NULLs as distinct, while
SELECT DISTINCT/GROUP BYgroup all NULLs together. - Some justify this as “set/group semantics”: all unknowns form one group, but are not equal as values.
- Others find this inconsistent and argue it forces awkward patterns and mental overhead.
Database-Specific Behavior
- Postgres 15+ allows
NULLS NOT DISTINCTin unique indexes, giving control over NULL uniqueness. IS [NOT] DISTINCT FROM(or dialect equivalents) is highlighted as the “correct” equality operator when NULLs must compare as equal.- Oracle’s empty‑string‑equals‑NULL behavior is widely criticized as especially bizarre.
- Different engines (Postgres, MySQL/MariaDB, SQL Server, SQLite, Oracle) diverge on indexing, uniqueness, and NULL treatment, adding portability pain.
Schema Design and Modeling Choices
- Academic relational purists argue NULL should rarely exist; optional data should be modeled as separate tables (1:0..1 relations).
- Practitioners push back: this level of normalization is often impractical and verbose, so nullable columns are the de facto solution.
- Soft deletes using
deleted_at IS NULLare debated: some call it overloading and a design smell; others say it’s fine with proper indexes/views.
Alternatives, Workarounds, and Ergonomics
- Suggested mitigations: COALESCE/IFNULL,
IS [NOT] DISTINCT FROM, partial/filtered indexes, views for “active” rows. - Some wish for richer type systems (sum types, Option/Maybe, multiple null kinds) or even SQL replacements (e.g., systems that avoid NULL entirely).
- Several note that NULL’s semantics are logically defensible but ergonomically hostile, especially for developers coming from “simple” language nulls and ORMs.