2025-05-16

Getting AI to write good SQL

Semantic layers and JSON vs raw SQL

One camp argues the key to reliable text-to-SQL is a semantic layer: pre-defined metrics, dimensions, and joins that encode business meaning (“what is MAU?”) and shield LLMs from raw schemas.
Proponents say LLMs are much more consistent emitting small, structured JSON query specs than long SQL strings; the JSON is then compiled to SQL.
Many others react strongly against “writing queries in JSON”, calling it tail-wagging-the-dog and pointing out that plain SQL is already a declarative, semantic layer with mature tooling. They compare JSON-based query ASTs to ORMs and query builders—useful for machines, unpleasant for humans.

Effectiveness of text-to-SQL in practice

Several commenters say: for modest schemas, “give the model the DDL and a clear question” works surprisingly well, especially with modern models (o3, GPT‑4o, Claude, Gemini, etc.).
Others report poor results, especially with BigQuery’s Gemini integration or large, undocumented, constraint-free warehouses: wrong joins, hallucinated columns, non-performant queries.
Text-to-SQL is seen as near-solved for toy demos, but hard in “real life” with thousands of tables, denormalized messes, and business logic encoded out-of-band.

Business context, intent, and semantic drift

Multiple threads emphasize that understanding user intent and business semantics is harder than generating syntactically valid SQL.
Metrics definitions, messy legacy schemas, and ambiguous terminology still require human data/analytics expertise; no amount of text-to-SQL gloss can answer high-level “why” questions without that groundwork.
Semantic layers / ontologies are proposed as a bridge: humans curate metrics and relationships; LLMs operate over that layer.

Safety, performance, and governance

Concerns: non-experts can now generate heavy queries that hurt production systems; LLMs rarely optimize for SARGability, indexes, or locking.
Suggested mitigations: read-only replicas, workload management queues, dry-run and parsing of generated SQL, mandatory expert review for anything impactful.
Several experienced SQL users find LLMs useful for brainstorming or boilerplate, but still faster and safer to hand-write and optimize serious queries.

Broader AI/tooling discourse

Side discussions compare text-to-SQL with “text to regex/shell”: skeptics argue you still need enough expertise to specify and verify correctness, so these tools mostly amplify experts rather than replace them.
Gemini 2.5 receives both strong praise (“game changing for coding/SQL”) and strong criticism (“hallucinates APIs, overcomments, feels like marketing hype”).
There’s debate over hype, job displacement, and whether AI democratizes programming or erodes hard-won expertise.

Related topics