TinyCompiler: A compiler in a week-end

Simplicity and educational value

  • Many commenters like that TinyCompiler is small, dependency‑free, and hand‑written (no LLVM, yacc, etc.).
  • Seen as a modern equivalent to classic “build a compiler” texts: enough to demystify compilers and get people hooked.
  • Several say this is the kind of resource they wish they’d had a decade ago, especially for targeting unusual or archaic hardware.

Prior art and alternative tiny compilers

  • Older small compilers are mentioned (e.g., Crenshaw’s series, tiny C implementations, Python compilers) as similar learning resources.
  • Some highlight online parser playgrounds and small Python compilers as additional examples.

Learning path: interpreter first, then backend

  • One camp recommends starting with an interpreter, then moving to LLVM or another backend to avoid early roadblocks (dominance, SSA, CFG analysis).
  • Others argue that writing a simple backend to assembly is itself educational and sometimes necessary (e.g., for niche targets without LLVM).

Difficulty and the “weekend” claim

  • Some doubt you can “understand compilers” in a weekend; others argue you can grasp the core concepts in a day with a tiny language.
  • Distinction is made between toy compilers and production ones; the latter are hard because of language complexity and performance goals.
  • The author clarifies “week-end” refers to how long this particular project took, not to mastering compiler theory.

Parsing, expressions, and “hard parts”

  • Mixed views on what’s hardest:
    • For beginners: infix expression parsing and operator precedence; Pratt parsing and the shunting‑yard algorithm are mentioned as standard approaches.
    • Others say parsing is easy; function calls, calling conventions, register allocation, SSA, and optimization are the real challenges.
  • Discussion digs into SSA construction strategies (classic dominance‑based, maximal SSA + DCE, alternative algorithms) and mem2reg‑style passes.
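The precedence problem above can be illustrated with a minimal Pratt‑style (precedence‑climbing) parser. This is a sketch, not code from TinyCompiler; the token format and binding‑power table are invented for the example.

```python
# Illustrative binding powers: higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def parse_expr(tokens, min_bp=0):
    """Parse a flat token list (consumed left to right) into a nested tuple AST."""
    lhs = tokens.pop(0)                    # assume a bare number or identifier
    while tokens and tokens[0] in BINDING_POWER:
        op = tokens[0]
        bp = BINDING_POWER[op]
        if bp <= min_bp:                   # operator binds too loosely; let caller take it
            break
        tokens.pop(0)                      # consume the operator
        rhs = parse_expr(tokens, bp)       # recurse for tighter-binding operators
        lhs = (op, lhs, rhs)
    return lhs

# "1 + 2 * 3" parses as ('+', '1', ('*', '2', '3')) — '*' binds tighter than '+'
print(parse_expr(["1", "+", "2", "*", "3"]))
```

Using `bp <= min_bp` as the stop condition makes same-precedence operators left‑associative, so `1 - 2 - 3` parses as `(1 - 2) - 3`.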

What counts as a compiler

  • Debate over whether an AST + interpreter (or bytecode interpreter) is “really” a compiler.
  • One side insists compilation implies nontrivial transformation and code generation; others argue even naive syntax‑directed bytecode/machine‑code generation is compilation.
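The "naive syntax‑directed generation" position can be made concrete with a sketch: a single AST walk that emits stack‑machine bytecode with no intermediate passes. The AST tuple shape and opcode names here are illustrative assumptions, not any particular implementation.

```python
def compile_expr(node, code):
    """One recursive pass: post-order walk of a (op, lhs, rhs)/int AST, emitting bytecode."""
    if isinstance(node, int):
        code.append(("PUSH", node))
    else:
        op, lhs, rhs = node
        compile_expr(lhs, code)            # operands first (post-order)...
        compile_expr(rhs, code)
        code.append(("ADD",) if op == "+" else ("MUL",))  # ...then the operator
    return code

def run(code):
    """Tiny stack machine, just to check the emitted bytecode."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()

bytecode = compile_expr(("+", 1, ("*", 2, 3)), [])
print(bytecode)       # [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('MUL',), ('ADD',)]
print(run(bytecode))  # 7
```

Whether this ten‑line translation "counts" as compilation is exactly the debate: it performs no analysis or optimization, yet it does lower one representation into executable instructions for a different machine.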

Backends and IR choices

  • LLVM is seen as powerful but heavy; some want lighter backends.
  • QBE is praised for the performance of its output (a large fraction of GCC's, per one commenter's benchmarks) but criticized as hard to extend (minimal comments, terse style).
  • Alternatives discussed: libfirm, Cranelift, LuaJIT; trade‑offs in size, complexity, and hackability.
  • Concerns about Linux’s reliance on GCC extensions making alternative compilers harder to adopt.

Language design and “wend”

  • Several appreciate how “wend” is minimal yet expressive enough to run nontrivial demos (e.g., a fire effect).
  • This style is likened to teaching languages like Pascal or Python, which began as pedagogical tools but proved practical.
  • Side discussion contrasts simple, C‑like teaching languages with more complex modern languages (C++, Rust), arguing complexity affects compiler ecosystem viability.

Resources and courses

  • Commenters exchange book recommendations: some criticize classic theory‑heavy texts (e.g., the “Dragon Book”) as poor first introductions.
  • Highly recommended modern resources include practitioner‑oriented books and online serials on interpreters/compilers, plus university courses that start from codegen and work backwards.
  • Several paid and free courses are mentioned positively (e.g., week‑long compiler courses, video series); some wish those courses spent more time on emitting assembly rather than offloading codegen to LLVM.

Off‑topic note about symbolism

  • One commenter points out that a T‑shirt shown in the article might be misread as a hate symbol in some contexts.
  • The author acknowledges this, removes the reference, and expresses a desire not to inadvertently offend.