Parsing protobuf at 2+GB/s: how I learned to love tail calls in C (2021)
Compiler support and tail-call attributes
- [[musttail]] in Clang/LLVM guarantees tail calls or emits an error; GCC is adding a compatible feature. This is crucial for designs where stack growth would be fatal.
- Many ABIs make tail calls hard; compilers may alter call sequences (e.g., no-PLT calls) to honor musttail. Complete Scheme-style guarantees are seen as unrealistic.
- New calling conventions like preserve_none (and the older preserve_all) are used to reduce register spills in tail-call-heavy interpreters; musttail + preserve_none is highlighted as a powerful combo.
- There is a C23/C-standard attribute system, plus -foptimize-sibling-calls, but these do not provide the “fail if not tail-called” guarantee.
- A C proposal (“return goto expr;”) would add standardized tail calls with explicit semantics around object lifetimes; some argue it’s easier to implement than [[musttail]].
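As a minimal sketch of what the guarantee asks of the programmer, the accumulator-style function below marks its recursive call with the attribute. MUSTTAIL and sum_to are hypothetical names introduced for illustration, and the GNU-style `__attribute__((musttail))` spelling is an assumption chosen because Clang accepts it in plain C. On a compiler that does not report support, the macro expands to nothing and the call is only a candidate for optimization.

```c
#include <stdint.h>

/* Hypothetical helper macro (not from the discussion): expands to the
   attribute where the compiler reports support, and to nothing otherwise. */
#ifdef __has_attribute
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee: the call may still consume stack */
#endif

/* Sums 1..n with an accumulator so the recursive call is in tail position.
   Caller and callee share one signature, which keeps the call eligible. */
uint64_t sum_to(uint64_t n, uint64_t acc) {
    if (n == 0) {
        return acc;
    }
    /* The call is the entire return expression, so it qualifies.
       `return n + sum_to(n - 1, 0);` would not: the addition would have
       to run after the call returns, which needs the caller's frame. */
    MUSTTAIL return sum_to(n - 1, acc + n);
}
```

Calling `sum_to(100, 0)` yields 5050. Where the attribute is honored, the stack depth stays constant regardless of n; where it is not, deep inputs can still overflow the stack.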
Interpreter design and performance
- Traditional VM style is a loop with a big switch, or computed gotos. This is readable and portable but leads to:
  - Poor branch prediction (one indirect branch for all opcodes).
  - Huge, hard-to-optimize functions and fragile register allocation.
- Tail-call threaded interpreters (one function per opcode, calling the next via musttail) give:
  - One indirect branch per opcode, which matches branch predictors better (Markov-chain-like).
  - Smaller functions with better register allocation and fewer spills.
- Trampolines (returning a function pointer to an outer loop) are portable but likely slower and harder on branch prediction.
- Fallback/slow paths and error handling can force stack frames and spills; placing them in separate functions and sometimes tail-calling them is key both for performance and to avoid unbounded stack growth.
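The tail-call threaded style described above can be sketched as follows. This is an illustration under stated assumptions, not the design of any particular interpreter: the three-opcode bytecode, the handler names, run, and the MUSTTAIL and DISPATCH macros are all hypothetical, and the bytecode is assumed to be well formed and terminated by OP_HALT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: the attribute where supported, nothing otherwise. */
#ifdef __has_attribute
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical bytecode for a machine with a single accumulator. */
enum { OP_INC, OP_DOUBLE, OP_HALT, OP_COUNT };

/* Every handler shares this signature so each one can tail-call the next.
   Interpreter state travels in arguments, which tend to stay in registers. */
typedef int64_t op_fn(const uint8_t *pc, int64_t acc);

static op_fn op_inc, op_double, op_halt;

static op_fn *const dispatch_table[OP_COUNT] = {
    [OP_INC] = op_inc,
    [OP_DOUBLE] = op_double,
    [OP_HALT] = op_halt,
};

/* Each handler ends with its own indirect branch to the next handler.
   No bounds check: opcodes are assumed to be valid. */
#define DISPATCH() MUSTTAIL return dispatch_table[*pc](pc + 1, acc)

static int64_t op_inc(const uint8_t *pc, int64_t acc) {
    acc += 1;
    DISPATCH();
}

static int64_t op_double(const uint8_t *pc, int64_t acc) {
    acc *= 2;
    DISPATCH();
}

static int64_t op_halt(const uint8_t *pc, int64_t acc) {
    (void)pc;
    return acc; /* the only handler that returns to run()'s caller */
}

/* Entry point: an ordinary call into the first handler. */
int64_t run(const uint8_t *program, int64_t acc) {
    assert(program != NULL);
    return dispatch_table[*program](program + 1, acc);
}
```

Running the program `{OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT}` from an accumulator of 0 returns 5. A slow path would follow the same pattern: a separate function with the same signature, reached by a tail call, so the hot handlers stay small.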
Portability, language evolution, and extensions
- Using musttail creates a non-standard C dialect; some see this as acceptable compared to writing assembly. Others emphasize the risk of silent stack overflows when attributes are ignored.
- Common practice is to hide attributes behind macros so code compiles on compilers that don’t support them, trading hard guarantees for “best effort.”
- There is debate over language extensions: some see them as necessary evolution (and common in real-world C), others as ecosystem “sprawl” reminiscent of browser-era vendor quirks.
- Some welcome new energy in C (e.g., C23), while others worry C will accumulate C++-style complexity if features like function literals and more are added without restraint.
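The macro-hiding practice mentioned above, and the trade-off it carries, might look like the sketch below. All names (TAILCALL, HAVE_GUARANTEED_TAILCALL, REQUIRE_TAILCALL, count_chars, tailcalls_are_guaranteed) are hypothetical. The opt-in `#error` is one possible way to address the silent-overflow concern, not something the discussion prescribes.

```c
#include <stddef.h>

/* Detect the attribute; remember whether the guarantee is available. */
#ifdef __has_attribute
#  if __has_attribute(musttail)
#    define TAILCALL __attribute__((musttail))
#    define HAVE_GUARANTEED_TAILCALL 1
#  endif
#endif

#ifndef TAILCALL
#  ifdef REQUIRE_TAILCALL
     /* Strict builds refuse to compile instead of silently degrading. */
#    error "this build requires a compiler with musttail support"
#  endif
#  define TAILCALL /* best effort: left to the optimizer */
#  define HAVE_GUARANTEED_TAILCALL 0
#endif

/* Counts characters up to the terminating NUL via tail recursion.
   Without the guarantee, very long strings could exhaust the stack. */
size_t count_chars(const char *s, size_t n) {
    if (*s == '\0') {
        return n;
    }
    TAILCALL return count_chars(s + 1, n + 1);
}

/* Lets callers cap input size or pick another strategy at run time. */
int tailcalls_are_guaranteed(void) {
    return HAVE_GUARANTEED_TAILCALL;
}
```

Building with `-DREQUIRE_TAILCALL` turns the best-effort fallback into a compile-time failure, which restores the hard guarantee at the cost of portability.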
Safety and tail calls in other languages
- Scheme’s guaranteed tail-call optimization drove techniques like CPS and trampolines; lack of TCO in targets (C, JavaScript) hurts performance.
- Rust has explored a become keyword to guarantee tail calls, but drops/destructors make most apparent tail calls invalid in practice.
- In C++ and Rust, destructors and cleanup attributes generally block tail calls; musttail is disallowed in such contexts and when exceptions/destructors are active.
- There is a strong subthread arguing C’s unsafety is no longer acceptable for many domains, countered by claims that C remains foundational and can be made safer with tooling.
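For targets without guaranteed tail calls, the trampoline technique mentioned above can be sketched in standard C. The names here (vm, step, sum_step, trampoline_sum) are hypothetical. Each step returns the next step to run rather than calling it, and an outer loop drives the chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical machine state shared by all steps. */
typedef struct vm {
    uint64_t n;
    uint64_t acc;
} vm;

/* A C function type cannot refer to itself, so the "next function"
   pointer is wrapped in a struct. */
typedef struct step {
    struct step (*fn)(vm *);
} step;

/* One iteration of summing n, n-1, ..., 1 into acc. */
static step sum_step(vm *m) {
    if (m->n == 0) {
        return (step){ NULL }; /* NULL tells the loop to stop */
    }
    m->acc += m->n;
    m->n -= 1;
    return (step){ sum_step }; /* hand the "tail call" back to the loop */
}

/* The trampoline: every step returns here before the next one starts,
   so stack depth stays constant without any compiler guarantee. */
uint64_t trampoline_sum(uint64_t n) {
    vm m = { .n = n, .acc = 0 };
    step next = { sum_step };
    while (next.fn != NULL) {
        next = next.fn(&m);
    }
    return m.acc;
}
```

The cost is visible in the loop: every transition is a return followed by an indirect call from a single call site, which is the extra overhead and branch-prediction pressure noted earlier.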