Parsing protobuf at 2+GB/s: how I learned to love tail calls in C (2021)

Compiler support and tail-call attributes

  • [[musttail]] in Clang/LLVM guarantees tail calls or emits an error; GCC is adding a compatible feature. This is crucial for designs where stack growth would be fatal.
  • Many ABIs make tail calls hard; compilers may alter call sequences (e.g., no-PLT calls) to honor musttail. Complete Scheme-style guarantees are seen as unrealistic.
  • New calling conventions like preserve_none (and older preserve_all) are used to reduce register spills in tail-call-heavy interpreters; musttail + preserve_none is highlighted as a powerful combo.
  • C23 standardizes the [[...]] attribute syntax, and compilers offer -foptimize-sibling-calls, but neither provides the “fail if not tail-called” guarantee.
  • A C proposal (“return goto expr;”) would add standardized tail calls with explicit semantics around object lifetimes; some argue it’s easier to implement than [[musttail]].
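
As a minimal sketch of the attribute in use, the following C function sums 1..n with an accumulator so that the recursive call sits in tail position. The MUSTTAIL macro and the sum_to function are hypothetical names chosen for illustration. The macro assumes a compiler that exposes __has_attribute and the GNU-style __attribute__((musttail)) spelling on return statements; elsewhere it expands to nothing, which compiles but drops the guarantee.

```c
#include <stdint.h>

/* Hypothetical helper macro (not from the discussion): expands to the
 * musttail attribute when the compiler reports support for it, and to
 * nothing otherwise. The empty fallback is "best effort" only. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee: the stack may grow with each call */
#endif

/* Sums 1..n. The accumulator keeps the recursive call in tail position,
 * so with musttail the call reuses the current stack frame. */
uint64_t sum_to(uint64_t n, uint64_t acc)
{
    if (n == 0)
        return acc;
    MUSTTAIL return sum_to(n - 1, acc + n);
}
```

Calling sum_to(1000, 0) yields 500500. With the attribute active, stack depth stays constant for any n; with the empty fallback, that depends on the optimizer, which is exactly the guarantee -foptimize-sibling-calls alone does not give.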

Interpreter design and performance

  • Traditional VM style is a loop with a big switch, or computed gotos. This is readable and portable but leads to:
    • Poor branch prediction (one indirect branch for all opcodes).
    • Huge, hard-to-optimize functions and fragile register allocation.
  • Tail-call threaded interpreters (one function per opcode, calling the next via musttail) give:
    • One indirect branch per opcode handler, so the predictor can learn which opcode tends to follow which (Markov-chain-like).
    • Smaller functions with better register allocation and fewer spills.
  • Trampolines (returning a function pointer to an outer loop) are portable but likely slower and harder on branch prediction.
  • Fallback/slow paths and error handling can force stack frames and spills; placing them in separate functions and sometimes tail-calling them is key both for performance and to avoid unbounded stack growth.
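
The tail-call threaded style described above might look roughly like this in C. It is a sketch of a toy accumulator machine, not the design of any particular VM: the opcodes, the op_fn signature, the NEXT macro, and run are all hypothetical. It assumes well-formed bytecode that ends with OP_HALT, and the MUSTTAIL macro assumes a compiler with __has_attribute and the GNU-style musttail spelling, falling back to a plain call otherwise.

```c
#include <stdint.h>

/* Hypothetical helper macro: musttail when available, else best effort. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Toy bytecode: one opcode byte each; OP_ADDI carries a one-byte immediate. */
enum { OP_ADDI, OP_DOUBLE, OP_HALT, OP_COUNT };

/* All handlers share one signature so each can tail-call the next.
 * Interpreter state (pc, acc) travels in arguments, i.e. in registers. */
typedef int64_t op_fn(const uint8_t *pc, int64_t acc);

static op_fn op_addi, op_double, op_halt;

static op_fn *const dispatch[OP_COUNT] = {
    [OP_ADDI]   = op_addi,
    [OP_DOUBLE] = op_double,
    [OP_HALT]   = op_halt,
};

/* Each handler ends with its own indirect jump to the next handler.
 * No bounds check: the sketch assumes every opcode byte is valid. */
#define NEXT(pc, acc) MUSTTAIL return dispatch[*(pc)]((pc), (acc))

static int64_t op_addi(const uint8_t *pc, int64_t acc)
{
    acc += pc[1];       /* add the immediate operand */
    pc += 2;            /* skip opcode and operand */
    NEXT(pc, acc);
}

static int64_t op_double(const uint8_t *pc, int64_t acc)
{
    acc *= 2;
    pc += 1;
    NEXT(pc, acc);
}

static int64_t op_halt(const uint8_t *pc, int64_t acc)
{
    (void)pc;
    return acc;         /* the only handler that actually returns */
}

/* Entry point: starts with acc = 0 and jumps to the first handler. */
int64_t run(const uint8_t *code)
{
    return dispatch[code[0]](code, 0);
}
```

For the program { OP_ADDI, 5, OP_DOUBLE, OP_ADDI, 1, OP_HALT }, run returns 11. Because every handler carries its own dispatch branch, the predictor sees a separate branch site per opcode rather than one shared site, and each handler is small enough for the register allocator to handle on its own.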

Portability, language evolution, and extensions

  • Using musttail creates a non-standard C dialect; some see this as acceptable compared to writing assembly. Others emphasize the risk of silent stack overflows when attributes are ignored.
  • Common practice is to hide attributes behind macros so code compiles on compilers that don’t support them, trading hard guarantees for “best effort.”
  • There is debate over language extensions: some see them as necessary evolution (and common in real-world C), others as ecosystem “sprawl” reminiscent of browser-era vendor quirks.
  • Some welcome new energy in C (e.g., C23), while others worry C will accumulate C++-style complexity if features like function literals and more are added without restraint.

Safety and tail calls in other languages

  • Scheme’s guaranteed tail-call optimization drove techniques like CPS and trampolines; lack of TCO in compilation targets (C, JavaScript) hurts performance.
  • Rust has explored a become keyword to guarantee tail calls, but drops/destructors make most apparent tail calls invalid in practice.
  • In C++ and Rust, destructors and cleanup attributes generally block tail calls; musttail is disallowed where destructors or exception handling would still have to run after the call.
  • There is a strong subthread arguing C’s unsafety is no longer acceptable for many domains, countered by claims that C remains foundational and can be made safer with tooling.
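
The point about destructors and cleanup attributes can be illustrated in C with the GNU cleanup attribute, which GCC and Clang support as an extension. In this sketch all names (record, release, callee, caller, events) are hypothetical. The call in caller looks like a tail call, but the cleanup handler has to run after callee returns, so caller's frame must stay alive across the call.

```c
#include <stddef.h>

/* Small event log so the order of callee and cleanup can be observed.
 * Static storage is zero-initialized, so the log stays NUL-terminated. */
static char event_log[8];
static size_t event_count;

static void record(char c)
{
    if (event_count < sizeof event_log - 1)
        event_log[event_count++] = c;
}

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void release(int *resource)
{
    (void)resource;
    record('C');
}

static int callee(int x)
{
    record('F');
    return x + 1;
}

int caller(int x)
{
    /* GNU extension: release(&resource) runs when resource leaves scope. */
    __attribute__((cleanup(release))) int resource = 0;

    /* Looks like a tail call, but is not one: release() still has to run
     * after callee() returns, so this frame cannot be discarded first. */
    return callee(x);
}

/* Accessor for the recorded event order. */
const char *events(void)
{
    return event_log;
}
```

After caller(1) returns 2, events() reads "FC": the callee ran first and the cleanup ran afterwards, inside caller's still-live frame. A C++ destructor or a Rust drop at the end of scope places the same kind of work after the call.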