Getting AI to write good SQL

Semantic layers and JSON vs raw SQL

  • One camp argues the key to reliable text-to-SQL is a semantic layer: pre-defined metrics, dimensions, and joins that encode business meaning (“what is MAU?”) and shield LLMs from raw schemas.
  • Proponents say LLMs are much more consistent emitting small, structured JSON query specs than long SQL strings; the JSON is then compiled to SQL.
  • Many others react strongly against “writing queries in JSON”, calling it tail-wagging-the-dog and pointing out that plain SQL is already a declarative, semantic layer with mature tooling. They compare JSON-based query ASTs to ORMs and query builders—useful for machines, unpleasant for humans.

Effectiveness of text-to-SQL in practice

  • Several commenters say: for modest schemas, “give the model the DDL and a clear question” works surprisingly well, especially with modern models (o3, GPT‑4o, Claude, Gemini, etc.).
  • Others report poor results, especially with BigQuery’s Gemini integration or large, undocumented, constraint-free warehouses: wrong joins, hallucinated columns, non-performant queries.
  • Text-to-SQL is seen as near-solved for toy demos, but hard in “real life” with thousands of tables, denormalized messes, and business logic encoded out-of-band.

Business context, intent, and semantic drift

  • Multiple threads emphasize that understanding user intent and business semantics is harder than generating syntactically valid SQL.
  • Metrics definitions, messy legacy schemas, and ambiguous terminology still require human data/analytics expertise; no amount of text-to-SQL gloss can answer high-level “why” questions without that groundwork.
  • Semantic layers / ontologies are proposed as a bridge: humans curate metrics and relationships; LLMs operate over that layer.

Safety, performance, and governance

  • Concerns: non-experts can now generate heavy queries that hurt production systems; LLMs rarely optimize for SARGability, indexes, or locking.
  • Suggested mitigations: read-only replicas, workload management queues, dry-run and parsing of generated SQL, mandatory expert review for anything impactful.
  • Several experienced SQL users find LLMs useful for brainstorming or boilerplate, but still faster and safer to hand-write and optimize serious queries.

Broader AI/tooling discourse

  • Side discussions compare text-to-SQL with “text to regex/shell”: skeptics argue you still need enough expertise to specify and verify correctness, so these tools mostly amplify experts rather than replace them.
  • Gemini 2.5 receives both strong praise (“game changing for coding/SQL”) and strong criticism (“hallucinates APIs, overcomments, feels like marketing hype”).
  • There’s debate over hype, job displacement, and whether AI democratizes programming or erodes hard-won expertise.