2026-04-15

Want to write a compiler? Just read these two papers (2008)

Learning Resources for Compilers

Many recommend approachable, project-focused material over dense theory:
- Crenshaw’s “Let’s Build a Compiler” and similar “tiny compiler” series.
- “Crafting Interpreters” is widely praised, though some wish for a sequel covering types, optimization, and linking.
- Incremental/educational texts: Ghuloum’s “An Incremental Approach to Compiler Construction,” “Essentials of Compilation,” a short compiler book by Wirth, and a small C-compiler book.
- Courses and video series: nand2tetris, a well-regarded Stanford compilers course, CS6120, and other online lectures.
Commenters also share several links to freely available PDFs and archived books and papers (nanopass, Wirth, Bornat, etc.).

Difficulty and Course Experiences

Compiler courses are repeatedly described as very hard but often rewarding.
Some found them purely painful, while others say teacher quality made the biggest difference.
There is disagreement over whether writing a simple compiler is “not that difficult” or beyond most CS graduates without strong guidance.

Parsing, Frontends, and Syntax

There is strong debate over parsing approaches:
- Some favor parser combinators and recursive descent for clarity and better error messages.
- Others argue traditional lexer/parser splits and parser generators are still valuable, especially for understanding grammar design.
The general sense is that modern educational resources de-emphasize deep parsing theory compared to the “Dragon Book.”
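To make the recursive-descent side of the debate concrete, here is a minimal sketch of a hand-written parser for integer arithmetic. It is not taken from the thread: the grammar, the tuple-based AST, and the names `tokenize`, `Parser`, and `parse` are all assumptions chosen for illustration. Each grammar rule becomes one method, and each error reports the column where parsing failed, which is the kind of control over diagnostics that proponents point to.

```python
def tokenize(src):
    """Return a list of (kind, value, column) triples ending in an 'eof' token."""
    tokens = []
    pos = 0
    while pos < len(src):
        ch = src[pos]
        if ch.isspace():
            pos += 1
        elif ch.isdigit():
            start = pos
            while pos < len(src) and src[pos].isdigit():
                pos += 1
            tokens.append(("num", int(src[start:pos]), start))
        elif ch in "+-*/()":
            tokens.append(("op", ch, pos))
            pos += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at column {pos}")
    tokens.append(("eof", None, len(src)))
    return tokens


class Parser:
    """Hypothetical parser: one method per grammar rule."""

    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def at_op(self, ops):
        kind, value, _ = self.peek()
        return kind == "op" and value in ops

    # expr := term (('+' | '-') term)*
    def expr(self):
        node = self.term()
        while self.at_op("+-"):
            op = self.advance()[1]
            node = (op, node, self.term())  # left-associative
        return node

    # term := factor (('*' | '/') factor)*
    def term(self):
        node = self.factor()
        while self.at_op("*/"):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    # factor := NUMBER | '(' expr ')'
    def factor(self):
        kind, value, col = self.peek()
        if kind == "num":
            self.advance()
            return value
        if kind == "op" and value == "(":
            self.advance()
            node = self.expr()
            if not self.at_op(")"):
                raise SyntaxError(f"expected ')' at column {self.peek()[2]}")
            self.advance()
            return node
        raise SyntaxError(f"expected a number or '(' at column {col}")


def parse(src):
    parser = Parser(src)
    node = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError(f"unexpected token at column {parser.peek()[2]}")
    return node
```

Under these assumptions, `parse("1 + 2 * 3")` returns `("+", 1, ("*", 2, 3))`, and `parse("1 + ")` raises a `SyntaxError` naming column 4. A parser generator would derive the same precedence from a grammar file instead; the sketch only shows why some find the hand-written form easier to read and debug.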

Nanopass and Incremental Design

Nanopass is seen as underappreciated: the key idea is many small passes with explicit input/output languages and invariants.
This structure is argued to make compilers easier to extend and debug than monolithic designs.
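A rough sketch of the nanopass idea, written in plain Python rather than with the actual nanopass framework: each pass is a small function, and a checker enforces which operators are legal in each intermediate language. The languages `L0` and `L1`, the operator sets, and all function names here are hypothetical, and the checker ignores arity to stay short.

```python
# Expressions are ints, variable names (str), or tuples like ("add", a, b).
L0_OPS = {"add", "sub", "neg"}   # source language (assumed for illustration)
L1_OPS = {"add", "neg"}          # after desugaring: no "sub" allowed


def check(expr, ops, lang):
    """Invariant check: every operator in expr must belong to the given language."""
    if isinstance(expr, (int, str)):
        return
    if not (isinstance(expr, tuple) and expr and expr[0] in ops):
        raise ValueError(f"{expr!r} is not valid in {lang}")
    for child in expr[1:]:
        check(child, ops, lang)


def desugar_sub(expr):
    """Pass 1 (L0 -> L1): rewrite a - b as a + (-b)."""
    if isinstance(expr, (int, str)):
        return expr
    op, *args = expr
    args = [desugar_sub(a) for a in args]
    if op == "sub":
        return ("add", args[0], ("neg", args[1]))
    return (op, *args)


def fold_constants(expr):
    """Pass 2 (L1 -> L1): evaluate operators whose operands are all literals."""
    if isinstance(expr, (int, str)):
        return expr
    op, *args = expr
    args = [fold_constants(a) for a in args]
    if all(isinstance(a, int) for a in args):
        if op == "add":
            return args[0] + args[1]
        if op == "neg":
            return -args[0]
    return (op, *args)


def to_stack_code(expr):
    """Pass 3 (L1 -> stack instructions): post-order walk."""
    if isinstance(expr, int):
        return [("push", expr)]
    if isinstance(expr, str):
        return [("load", expr)]
    op, *args = expr
    code = []
    for a in args:
        code += to_stack_code(a)
    code.append((op,))
    return code


def compile_expr(expr):
    """Run the passes in order, checking the language invariant between them."""
    check(expr, L0_OPS, "L0")
    expr = desugar_sub(expr)
    check(expr, L1_OPS, "L1")
    expr = fold_constants(expr)
    check(expr, L1_OPS, "L1")
    return to_stack_code(expr)
```

With these definitions, `compile_expr(("sub", "x", ("add", 1, 2)))` yields `[("load", "x"), ("push", -3), ("add",)]`. If a pass leaked a `"sub"` node into `L1`, the check right after that pass would fail, which is the debugging benefit claimed for this structure.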

Backends, IR, and Modern Concerns

The thread highlights the importance of SSA, data-flow analysis, and IR-based backends; some feel older texts cover these too thinly.
Using LLVM IR as a target is suggested as a practical way to avoid backend complexity, at the cost of learning less about codegen.
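As an illustration of targeting LLVM IR, the sketch below lowers a tiny expression tree to text meant to follow LLVM's textual IR syntax, assigning each intermediate result to a fresh temporary exactly once, as SSA requires. The tuple AST, the function name `emit_function`, and the restriction to `i32` with `add`, `sub`, and `mul` are assumptions for illustration; the output has not been run through LLVM tools here.

```python
# Map source operators to LLVM integer instructions (a deliberately small subset).
OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def emit_function(expr, name="compute"):
    """Lower an expression tree of ints and (op, lhs, rhs) tuples to IR text."""
    lines = []
    counter = 0

    def lower(node):
        nonlocal counter
        if isinstance(node, int):
            return str(node)          # literals are used directly as operands
        op, lhs, rhs = node
        a = lower(lhs)
        b = lower(rhs)
        tmp = f"%t{counter}"          # fresh name: each value is defined once
        counter += 1
        lines.append(f"  {tmp} = {OPCODES[op]} i32 {a}, {b}")
        return tmp

    result = lower(expr)
    body = "\n".join(lines + [f"  ret i32 {result}"])
    return f"define i32 @{name}() {{\nentry:\n{body}\n}}\n"
```

For `("+", 1, ("*", 2, 3))` this produces a function whose body is `%t0 = mul i32 2, 3`, then `%t1 = add i32 1, %t0`, then `ret i32 %t1`. Instruction selection, register allocation, and optimization would be left to LLVM, which is the trade-off the thread describes: less backend work, but less exposure to codegen.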

AI-Generated Toy Compilers

One side claims small LLM-generated compilers are great for learning by tinkering and seeing all phases in minimal code.
Others criticize such projects as buggy, poorly tested, and misleading for beginners, recommending safer targets (e.g., high-level languages) if using AI at all.
