2024-09-25

SQL Tips and Tricks

Overall reception

Many readers find the collection of tips helpful, especially for less-experienced SQL users.
Others see several tips as workarounds for SQL’s design quirks or as “developer convenience” at the cost of clarity or correctness.

Set operations, anti-joins, and NULLs

The EXCEPT trick is considered neat but is criticized as rarely appropriate for production:
- It removes duplicates by default (unless EXCEPT ALL).
- It’s hard for optimizers to push down and can be a late, expensive step.
EXISTS / NOT EXISTS are widely recommended over LEFT JOIN + WHERE NULL or IN/NOT IN for “anti-join” logic, clarity, and performance.
Noted: NOT EXISTS and EXCEPT behave differently in the presence of NULLs.

Formatting, commas, and readability

Major bikeshed: leading commas vs trailing commas vs aligning columns.
- Leading commas enable easy line-commenting and cleaner diffs but many find them visually offensive or unidiomatic.
- Trailing commas are seen as more natural, but many major databases don’t allow them in SELECT lists.
There’s no consensus SQL formatting standard; some recommend using automated formatters (e.g., SQLFluff) to avoid arguments.
Long single lines and dense expressions are criticized; CTEs and intermediate columns are suggested for readability.

SELECT * and column selection

Many argue against SELECT * in production for performance and plan quality, especially on columnar warehouses.
Some defend SELECT * for ad-hoc analysis to save time, acknowledging it is less ideal for recurring queries.
There’s a desire for better composability: reusable column groups, EXCLUDE/EXCEPT syntax for “all but these columns,” or views as a workaround.

WHERE 1=1 and “dummy” predicates

Used by some as a dev-time convenience to always append AND conditions or easily comment/uncomment filters.
Others see it as unnecessary, confusing to newcomers, and potentially encouraging unsafe dynamic SQL construction.
Disagreement over whether it introduces real security/performance risk:
- One side: the pattern often appears in string-built dynamic queries that risk SQL injection and plan bloat.
- Other side: 1=1 itself is harmless; the risk is how queries are constructed, not the literal.

Performance, indexing, and query plans

Strong emphasis on learning each DBMS, reading EXPLAIN plans, and profiling real workloads.
EXISTS often outperforms IN and some JOIN patterns for existence checks.
Some argue “any non-one-off query should avoid table scans” and advocate liberal indexing; others counter that:
- Scans are fine for small tables or analytical queries.
- Indexes have costs (storage, slower inserts) and should be balanced against data size and growth.
Discussion about:
- Indexing computed columns or expressions to make filters sargable.
- GROUP BY often driving index usefulness.
- UNION ALL sometimes being faster than OR-heavy predicates.
- Foreign keys not always auto-creating needed indexes; both sides of a relationship often need explicit indexes.

Subqueries, CTEs, and temp tables

One contributor suggests replacing some JOIN + DISTINCT patterns with scalar subqueries in the SELECT list; others warn:
- Performance varies widely by data shape and indexing.
- Naive use can lead to per-row subquery execution or errors when multiple rows match.
CTEs are not being discouraged by most; they’re recommended for complex logic and readability, though past performance quirks in some databases are noted.
For complex stored procedures:
- One pattern: copy relevant data into temp tables, manipulate, then update base tables in a transaction.
- Others warn this can add overhead and should be driven by profiling, not assumed.
- Advice not to over-break queries into small steps without evidence; give the optimizer a full picture first, then optimize if needed.

Dialect differences and non-standard features

QUALIFY is praised as useful but noted to be non-standard and unsupported in common OLTP engines like Postgres and SQL Server; supported in some warehouses (Snowflake, Databricks, Teradata).
Trailing commas in SELECT and advanced niceties (like SELECT * EXCLUDE/EXCEPT or JOIN USING) exist in some engines (BigQuery, DuckDB, Spanner) but not in mainstream ones.
Some wish for FROM-first / pipe-style syntaxes (like Kusto) to improve query ergonomics.

Comments and metadata

For SQL Server, using /* ... */ is suggested over -- in some contexts, because features like Query Store may strip line breaks, making line-commented code hard to reformat later.
Reminder that multi-line comments can be managed to ease toggling large blocks on and off.

Miscellaneous best practices

Don’t rely on foreign keys alone to create all necessary indexes; verify both referencing and referenced columns are indexed appropriately.
UPDATE statements should avoid writing unchanged rows to reduce log volume and trigger activity, with care around NULL semantics.
Window functions and INTERSECT are briefly highlighted as powerful but underused tools.
Several commenters emphasize: avoid “just in case” helpers in production SQL; prioritize clear intent and maintainability over tiny typing conveniences.

Related topics