Writing a Rust compiler in C

Project goals and approach

  • Dozer is a Rust compiler written in portable, minimal C that targets Cranelift/QBE. It aims to slot into the “from-nothing” bootstrappable toolchain at the point where a tiny C compiler such as TinyCC becomes available.
  • Main motivation: drastically shorten and simplify the bootstrap path to a modern Rust compiler compared to the current chain (Guile → OCaml → early Rust → many rustc versions).
  • It is not intended for performance or daily development, only as a bootstrap step that can compile the “real” rustc.

Why C, and why not other languages

  • Many commenters argue C is the practical first target because nearly every platform gets a C compiler first, and software written in portable C, such as Lua, is easy to build with a minimal C compiler.
  • Alternatives proposed: Java, a proto-Rust subset, Forth, WASM+wasm2c, or decompiling rustc output back to C. Critics note:
    • Java bootstrapping is complex.
    • Forth is ideal conceptually but unpleasant enough to program in that no one follows through at scale.
    • Generated C or blobs (like Zig’s wasm stage1) are seen as unauditable and against bootstrappable-build principles.

Security, trust, and reproducible builds

  • A major driver is reducing exposure to “trusting trust” style compiler backdoors and supply-chain attacks.
  • Bootstrappable Builds ethic: no pre-generated code; everything must be derivable from human-readable source, starting from a tiny binary seed (e.g., hex loader).
  • Some see full-chain auditing as still practically infeasible; others argue “more auditable than today” is already valuable.

Practicality, porting, and skepticism

  • Use cases discussed: porting Rust to new OSes, where current bootstrapping via many rustc versions and LLVM is painful and time-consuming.
  • Counterpoint: for new platforms, cross-compilation often suffices; Dozer specifically targets same-architecture, from-scratch bootstrapping.
  • Critics call the effort mostly aesthetic or futile given Rust’s rapid evolution and complexity; supporters value the cleaner bootstrap story and the educational benefit.

Project status and limitations

  • Current codebase is ~5k lines of C. Lexer and parts of the parser exist; typechecking is minimal (e.g., i32 only); macros/modules and robust codegen are missing.
  • It can handle only trivial Rust examples and cannot yet compile large crates like Tokio.
  • No substantial test suite is evident; some commenters want clearer test coverage of supported Rust features.
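
To give a feel for the earliest stage mentioned above, here is a minimal sketch of a lexer for a tiny Rust subset, written in plain C99 with only the standard library. It is not Dozer's code: the names `tok_kind`, `token`, `next_token`, and `count_tokens` are hypothetical, and the keyword and punctuation sets are deliberately small assumptions chosen for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for a tiny Rust subset; not Dozer's actual types. */
enum tok_kind { TOK_EOF, TOK_IDENT, TOK_KEYWORD, TOK_INT, TOK_PUNCT, TOK_ERROR };

struct token {
    enum tok_kind kind;
    const char *start; /* points into the source buffer, not a copy */
    size_t len;
};

/* Return 1 if the len-byte span at s is one of the few keywords we know. */
static int is_keyword(const char *s, size_t len)
{
    static const char *const kws[] = { "fn", "let", "mut", "return" };
    for (size_t i = 0; i < sizeof kws / sizeof kws[0]; i++)
        if (strlen(kws[i]) == len && memcmp(kws[i], s, len) == 0)
            return 1;
    return 0;
}

/* Scan one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token t;

    while (isspace((unsigned char)*p))
        p++;

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t.kind = is_keyword(t.start, (size_t)(p - t.start)) ? TOK_KEYWORD
                                                            : TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        t.kind = TOK_INT;
    } else if (p[0] == '-' && p[1] == '>') {
        p += 2; /* the two-character return-type arrow */
        t.kind = TOK_PUNCT;
    } else if (strchr("(){};:,=+-*/", *p)) {
        p++;
        t.kind = TOK_PUNCT;
    } else {
        p++; /* skip the unknown byte so callers can keep going */
        t.kind = TOK_ERROR;
    }

    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Count tokens up to end of input; -1 if an unrecognized byte appears. */
int count_tokens(const char *src)
{
    int n = 0;
    for (;;) {
        struct token t = next_token(&src);
        if (t.kind == TOK_EOF)
            return n;
        if (t.kind == TOK_ERROR)
            return -1;
        n++;
    }
}
```

Under these assumptions, `count_tokens("fn main() -> i32 { return 42; }")` yields 11. Tokens borrow from the source buffer instead of allocating, which keeps the sketch free of memory management; a real lexer would also need string and character literals, comments, lifetimes, and the rest of Rust's token grammar.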