Google's new pipe syntax in SQL

Topic drift: SQL pipes vs PDF-to-HTML

  • Several commenters note the HN title is misleading relative to the article, which quickly pivots from the Google SQL paper to using an LLM to turn PDFs into HTML/Markdown.
  • Some consider the PDF conversion demo more interesting than the SQL syntax itself; others are primarily there for the SQL proposal.

Pipe and FROM‑first SQL syntax

  • Many like the FROM‑first, piped style for complex analytical queries: it matches execution order, reads like a dataflow, and makes autocomplete and incremental query building easier.
  • Reported advantages: easier refactoring, multiple WHERE stages (filtering both before and after aggregation, where standard SQL needs WHERE and HAVING), and a more natural mental model ("chain of filters and transforms"). One commenter refactored a ~500‑line, 20‑table query and preferred the new style.
  • Skeptics argue SELECT‑first improves legibility and troubleshooting because the projection and source tables are visible immediately. They question the need to change a 50‑year‑old, widely understood syntax.
  • There’s concern about nonstandard extensions; one embedded‑DB maintainer is explicitly waiting for the SQL standard and major engines (e.g., Postgres) to adopt FROM‑first syntax before committing to it long‑term.
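
To make the "chain of filters and transforms" idea concrete, here is a minimal Python sketch. It models each pipe stage as a function over rows and checks the result against the equivalent standard SQL (WHERE plus HAVING) run in SQLite. The table, column names, sample data, and helper functions are all hypothetical; the pipe query is shown only as a string for comparison and is not executed.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, amount)
ORDERS = [("ann", 50), ("ann", 80), ("bob", 5), ("bob", 200), ("cy", 30)]

# How the query might read in pipe syntax (illustrative only, not executed here).
PIPE_QUERY = """
FROM orders
|> WHERE amount > 10
|> AGGREGATE SUM(amount) AS total GROUP BY customer
|> WHERE total > 100
|> ORDER BY total DESC
"""

def where(rows, pred):
    """Filter stage: keep rows matching the predicate."""
    return [r for r in rows if pred(r)]

def aggregate_sum(rows, key, value, out):
    """Aggregation stage: sum `value` per distinct `key`."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[value]
    return [{key: k, out: v} for k, v in totals.items()]

def order_by_desc(rows, col):
    """Ordering stage: sort rows by `col`, largest first."""
    return sorted(rows, key=lambda r: r[col], reverse=True)

def run_pipeline(orders):
    """Apply the stages top to bottom, mirroring the pipe query."""
    rows = [{"customer": c, "amount": a} for c, a in orders]
    rows = where(rows, lambda r: r["amount"] > 10)             # pre-aggregation filter
    rows = aggregate_sum(rows, "customer", "amount", "total")
    rows = where(rows, lambda r: r["total"] > 100)             # post-aggregation filter
    rows = order_by_desc(rows, "total")
    return [(r["customer"], r["total"]) for r in rows]

def run_standard_sql(orders):
    """Same query in SELECT-first SQL, where the second filter becomes HAVING."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    con.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    result = con.execute(
        "SELECT customer, SUM(amount) AS total FROM orders "
        "WHERE amount > 10 GROUP BY customer HAVING SUM(amount) > 100 "
        "ORDER BY total DESC"
    ).fetchall()
    con.close()
    return result
```

Both `run_pipeline(ORDERS)` and `run_standard_sql(ORDERS)` return the same rows; the difference is that the piped form uses one filter keyword at both stages and reads in the order the steps are applied.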

Relation to existing piped/query DSLs

  • Commenters draw comparisons to LINQ, Kusto (and the Kusto‑like PQL), PRQL (including PRQL in ClickHouse), dplyr/tidyverse, Flux, Ecto, and others.
  • Some view the Google proposal as a pragmatic, incremental change that can coexist with SQL; others see it as a too‑small reinvention given that richer non‑SQL DSLs already exist.
  • There’s a broader wish for a common SQL “core IR” (like MIR/CIR) and mention of Substrait as related work.

Syntax details and bikeshedding

  • GROUP BY ALL is praised for reducing boilerplate.
  • Combined clauses like GROUP AND ORDER BY are criticized as unnecessary complexity versus separate GROUP BY / ORDER BY.
  • Some want JSON‑style {key: value} inserts and universally allowed trailing commas.
  • Others dismiss the entire clause‑order debate as bikeshedding; they feel SQL is “fine” and already remarkably successful.
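
As a rough illustration of the boilerplate that GROUP BY ALL removes, the sketch below expands a select list into an explicit GROUP BY over every non-aggregated item, which is the usual meaning of the clause in engines that support it. The helper `expand_group_by_all`, the `sales` table, and its columns are hypothetical; SQLite is used only to run the expanded query.

```python
import sqlite3

def expand_group_by_all(table, items):
    """Build explicit SQL from select items given as (expression, alias, is_aggregate).

    Hypothetical helper for illustration: it interpolates identifiers directly,
    so it must not be used with untrusted input.
    """
    select_list = ", ".join(f"{expr} AS {alias}" for expr, alias, _ in items)
    # GROUP BY ALL groups by every select item that is not an aggregate.
    group_cols = [expr for expr, _, is_agg in items if not is_agg]
    sql = f"SELECT {select_list} FROM {table}"
    if group_cols:
        sql += " GROUP BY " + ", ".join(group_cols)
    return sql

def run_sales_query(sql, rows):
    """Run `sql` against an in-memory `sales` table and return sorted rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = sorted(con.execute(sql).fetchall())
    con.close()
    return result

ITEMS = [
    ("region", "region", False),
    ("product", "product", False),
    ("SUM(qty)", "units", True),
]
SALES = [("eu", "a", 1), ("eu", "a", 2), ("us", "b", 5)]
EXPANDED_SQL = expand_group_by_all("sales", ITEMS)
```

With GROUP BY ALL the author writes the grouping columns once, in the select list; without it, `region, product` has to be repeated and kept in sync by hand, as in `EXPANDED_SQL`.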

PDFs vs semantic HTML/Markdown

  • Long subthread on why PDFs are hard to copy from: glyph‑level layout, ligatures, legacy font handling, and inconsistent generators.
  • Some argue PDF was designed as a final rendered format; extraction quality depends on producers embedding proper mappings.
  • Disagreement over reading papers on phones: some prefer fixed two‑column PDFs; others strongly prefer reflowable HTML/EPUB and see paper‑optimized layouts as outdated.
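
The ligature issue raised in this subthread can be shown in a few lines. When a PDF maps a glyph to a ligature code point such as U+FB01 ("ﬁ"), copied text contains that single character instead of "f" + "i", so searches and comparisons fail. A minimal sketch, assuming the extracted text really contains Unicode ligature code points (many PDFs have worse or missing mappings that this cannot repair); the function name and sample string are hypothetical.

```python
import unicodedata

def normalize_extracted_text(text: str) -> str:
    """Fold compatibility characters (including ligatures) to plain letters.

    NFKC also rewrites other compatibility forms, such as superscript digits
    and full-width letters, so it can be too aggressive for some documents.
    """
    return unicodedata.normalize("NFKC", text)

# Hypothetical copied text: U+FB03 is the "ffi" ligature, U+FB01 is "fi".
copied = "e\ufb03cient \ufb01lter"
cleaned = normalize_extracted_text(copied)
```

Here `copied` looks like "efficient filter" on screen but does not contain the substring "filter"; `cleaned` does.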