2025-01-02

Advent of Code 2024 in pure SQL

Overall reaction

Many find solving Advent of Code in pure SQL both impressive and slightly “cursed” — admiration mixed with discomfort.
Several commenters describe it as “art” or “high wizardry,” especially given the constraints of standard SQL.
Some are inspired to share their own AoC experiments (SQL, ClickHouse, Sheets, EdgeQL, Cypher).

SQL, recursion, and logic programming

Recursive CTEs are highlighted as the core enabler; they make SQL feel like a logic language, often compared to Prolog or Datalog.
Some note that specific engines (e.g., Postgres) restrict recursion more than others; others mention that SQL Server’s recursive CTEs are more flexible.
There’s interest in better Datalog-style support and more efficient evaluation strategies than existing recursive implementations.

Where to put complex logic (DB vs application)

One camp argues RDBMSs are excellent places for business logic: schemas encode rules, set-based logic is concise, and computation “comes to the data.”
Another camp warns about overloading the DB: hard-to-maintain stored procedures, hidden triggers, opaque invariants, and painful debugging.
Several note that complex SQL is only viable if many team members are strong in SQL; otherwise, long-lived systems can become unmaintainable.

Ergonomics, tooling, and testability

SQL is praised as dense, declarative, and powerful, but criticized for:
- Poor tooling, weak editor support, and limited debugging.
- Difficult automated testing relative to general-purpose languages.
- Readability issues with very large or “spaghetti” queries.
Some want higher-level query languages that compile to SQL (e.g., PRQL-like) to improve ergonomics without abandoning existing databases.

Parsing and Advent-of-Code specifics

Parsing AoC input in pure SQL is viewed as the gnarliest part; string/regex operations and recursive line-reading are repeatedly mentioned.
Others argue parsing is only hard in the early days, with later problems dominated by algorithmic difficulty.
There’s debate about whether loading inputs into tables first should still count as “pure SQL”; some think that’s acceptable, others see it as cheating the constraint.

Performance and complexity

Several emphasize that thinking purely in “sets” isn’t enough; understanding query plans, indexes, and join strategies remains crucial for performance.
Distributed/analytics engines (BigQuery, Trino, ClickHouse, DuckDB) are mentioned as changing the trade-offs: fewer explicit indexes but new concerns (partitioning, tuple explosion).
Anecdotes show both sides:
- Huge SQL procedures that were slow and later rewritten faster in native code.
- Conversely, compact SQL replacing thousands of lines of application code.

Schema design and long-term maintainability

Multiple comments insist that getting the data model right is the real hard part; once the schema matches the business, SQL logic can be simple and intuitive.
CTEs and views are seen as ways to “unwrap” or re-shape messy schemas into more business-friendly forms without full physical redesign.
There is recurring tension between the elegance of declarative data logic and the practical pain of migrations, refactors, and onboarding new developers.

Related topics