2024-08-20

Making database systems usable

LLMs and Natural-Language Interfaces

Thread centers on whether LLMs can make databases more usable, especially via natural-language querying.
NL→SQL is described as far from solved; benchmark accuracy figures around high‑60% are seen as too low to rely on.
Some suggest targeting other languages (Datalog, QUEL, imperative or bytecode formats), arguing SQL is “just an implementation detail.”
Others counter that the hard part is interpreting messy natural language, not SQL’s quirks, and that users and auditors need inspectable, verifiable queries (SQL or similar).
A pragmatic approach mentioned: LLMs selecting among “canned” parameterized queries (natural-language BI) rather than generating arbitrary SQL.

Usability vs Operations in Databases

Distinction drawn between:
- “Production” schemas (normalized, strict, optimized for apps).
- “Play/analysis” schemas (denormalized, nullable, friendly for non‑programmers).
Some think you can mechanically derive a user‑friendly schema from a normalized one, but not the reverse; others say going from messy/flat to well‑designed schemas inherently requires domain expertise.
Operational usability (backup/restore, schema migrations, bulk loads) is emphasized as critical and often neglected; many complain vendors leave zero‑downtime migrations and operational hacks to third‑party tools.
Ideas to improve usability: column stores with instant schema changes, git‑style branching/merging of databases, and integrated form builders.

Learning SQL and the Problem with JOINs

Many participants say developers irrationally avoid learning SQL and JOINs, leading to ORMs and NoSQL “to avoid thinking about joins.”
Others push back that SQL’s inconsistencies and vendor differences raise the real learning curve.
JOINs are cited as a major usability pain: people fear accidental row multiplication and ambiguous join paths.
Some wish the system could infer multi‑table join paths from foreign keys; critics argue semantics are often ambiguous, so human intent is still needed.
ORMs (Django, SQLAlchemy, EdgeDB) and features like dot‑notation querying help hide joins but don’t remove conceptual complexity.

Relational Modeling vs Nested/Document Models

One camp complains relational modeling forces “everything as tables,” lacks first‑class nested structures (lists, embedded objects), and leads to table explosion for simple concepts.
Others argue relational normalization enforces clarity and integrity; document stores are seen as easier initially but dangerous long‑term (schema drift, hidden invariants in app code).
Several note that modern SQL already supports arrays, composite types, JSON, and range types (especially in Postgres and BigQuery).
Debate continues over 1NF and whether non‑scalar fields “break” the relational model; some say violating normal forms is acceptable when done consciously.
A broader “object–relational impedance mismatch” is discussed: it’s cognitively easier to think in objects, but relational storage is operationally superior.

Tools, Higher-Level Languages, and Alternatives

Examples raised: natural-language BI over canned queries, commercial NL→SQL tools, spreadsheet‑like DB frontends, and Access‑style UI builders.
Some advocate new query languages that feel more like general-purpose programming or object navigation (e.g., dot‑chaining across relations) instead of raw SQL.
Others insist that attempts to replace SQL have not yet delivered compelling advantages relative to SQL’s maturity and ubiquity.

Indexes and Performance Tuning

A subthread discusses how to design indexes: start with obvious heavy‑use columns and foreign keys, then refine based on slow queries and EXPLAIN/analysis tools.
Participants stress data‑dependent behavior: high‑cardinality columns, NULL handling, ORDER BY interactions, and composite index ordering all matter.
Some note that certain systems can suggest or even auto‑create indexes, but manual understanding remains important.

Broader UX & Historical Perspectives

Comparisons drawn between agents vs self‑service (e.g., travel agents vs flight websites) to illustrate trade‑offs between convenience, control, and cognitive load.
Several point out that end‑user database UIs (e.g., classic desktop tools) already solved many usability problems that modern tooling neglected.
There is general agreement that small reductions in friction can massively increase adoption; “zero effort” features are disproportionately valuable.

Related topics