The Cost of a Closure in C

Closure and “state” mechanisms in C

  • Several comments explore adding a “stateful function” concept or a state keyword: the compiler would generate a struct to hold per-call state and thread it implicitly through calls, much like async/await state machines.
  • Concerns: where the state struct lives (stack/heap/data segment), lifetime if callbacks outlive the defining frame, copying/moving state safely, recursion, and interaction with separate compilation.
  • Many argue the “C way” is explicit context management: user-provided structs and function-pointer+context patterns, akin to ucontext or C APIs that take void *userdata.
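
The explicit-context pattern from the last bullet can be sketched as follows. This is a minimal illustration, not code from the thread: `for_each_int`, `visit_fn`, `struct sum_above`, and the other names are hypothetical. The "captured" variables live in a caller-owned struct, and the callee hands the pointer back untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: `userdata` is passed back to the callback as-is. */
typedef void (*visit_fn)(int value, void *userdata);

/* Hypothetical library routine that knows nothing about the caller's state. */
static void for_each_int(const int *items, size_t count,
                         visit_fn visit, void *userdata)
{
    assert(visit != NULL);
    for (size_t i = 0; i < count; i++)
        visit(items[i], userdata);
}

/* The caller's explicit environment: what a closure would capture implicitly. */
struct sum_above {
    int threshold;  /* read-only input */
    long total;     /* accumulated result */
};

static void add_if_above(int value, void *userdata)
{
    struct sum_above *ctx = userdata;
    if (value > ctx->threshold)
        ctx->total += value;
}

/* The context lives on this frame, so it is valid for the whole call. */
static long sum_values_above(const int *items, size_t count, int threshold)
{
    struct sum_above ctx = { .threshold = threshold, .total = 0 };
    for_each_int(items, count, add_if_above, &ctx);
    return ctx.total;
}
```

Here the lifetime question raised above is answered by hand: `ctx` outlives every callback invocation only because `for_each_int` does not store the pointer. If it did, the caller would have to move the struct to the heap or to static storage.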

Performance, inlining, and benchmarks

  • Multiple commenters say the benchmark mostly reflects inlining and devirtualization behavior, not inherent closure cost.
  • One view is that, with a strong optimizer, non-allocating lambda/closure forms should compile to near-identical code; allocation-heavy forms (e.g., naive std::function) are the pathological cases.
  • Discussion of compilers’ difficulty eliminating std::function overhead vs std::function_ref, and how recursive patterns like Man‑or‑Boy amplify copying and heap allocation costs.
  • Some dismiss microbenchmarks as weak evidence for language design.
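
For readers unfamiliar with the Man-or-Boy test mentioned above, here is a rough sketch of it in plain C with explicit, stack-allocated environments. It is not the benchmark code discussed in the thread; the `thunk` type and all function names are assumptions made for illustration. Each call to `A` builds one environment on its own frame, and nothing is copied or heap-allocated, which is the "non-allocating form" the comments contrast with std::function-style wrappers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call-by-name argument: code pointer plus captured state. */
typedef struct thunk thunk;
struct thunk {
    int (*call)(thunk *self);
    int value;                  /* used by constant thunks only */
    int k;                      /* used by B: the enclosing A's mutable k */
    thunk *x1, *x2, *x3, *x4;   /* used by B: the enclosing A's arguments */
};

static int eval(thunk *t)
{
    assert(t != NULL);
    return t->call(t);
}

static int constant(thunk *self) { return self->value; }

static int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5);

/* B decrements its enclosing A's k, then calls A with itself as x1. */
static int B(thunk *self)
{
    self->k -= 1;
    return A(self->k, self, self->x1, self->x2, self->x3, self->x4);
}

static int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5)
{
    if (k <= 0)
        return eval(x4) + eval(x5);
    /* The environment for this frame's B lives on this frame's stack. */
    thunk b = { .call = B, .k = k, .x1 = x1, .x2 = x2, .x3 = x3, .x4 = x4 };
    return B(&b);
}

/* Knuth's original call is A(10, 1, -1, -1, 1, 0). */
static int man_or_boy(int k)
{
    thunk one       = { .call = constant, .value = 1 };
    thunk minus_one = { .call = constant, .value = -1 };
    thunk zero      = { .call = constant, .value = 0 };
    return A(k, &one, &minus_one, &minus_one, &one, &zero);
}
```

Every `thunk` is only ever passed downward to deeper calls, so the stack frames that own them are still live whenever they are evaluated. With these definitions `man_or_boy(10)` evaluates to -67, the value usually quoted for the test.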

Implementation strategies: nested functions, trampolines, wide pointers

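  • GNU nested functions whose address is taken are currently implemented with trampolines on the stack, which requires an executable stack; this is criticized on both security and performance grounds. Newer options such as -ftrampoline-impl=heap swap the executable stack for executable heap memory and introduce lifetime-management issues of their own.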
  • Alternatives proposed:
    • Use “fat pointers” (function pointer + environment pointer), like C++ lambdas or bound-member pointers.
    • Nested-function syntax but fat-pointer semantics, avoiding executable trampolines.
    • Function-descriptor ABIs or special-cased nested functions (static/register) to avoid trampolines.
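
The “fat pointer” alternative can be approximated in today's C by hand. The sketch below is an assumption-laden illustration, not any proposed standard syntax: `struct int_fn`, `call_int_fn`, and `make_adder` are hypothetical names. The point is that the callable is a plain two-word value, so no executable trampoline is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wide function pointer: code address plus environment pointer. */
struct int_fn {
    int (*code)(void *env, int arg);
    void *env;
};

/* Calling through the pair passes the environment as a hidden first argument. */
static int call_int_fn(struct int_fn f, int arg)
{
    assert(f.code != NULL);
    return f.code(f.env, arg);
}

/* Environment for an "add a fixed amount" closure. */
struct adder_env {
    int amount;
};

static int adder_code(void *env, int arg)
{
    const struct adder_env *e = env;
    return arg + e->amount;
}

/* The caller owns `env`; it must outlive every use of the returned value. */
static struct int_fn make_adder(struct adder_env *env)
{
    return (struct int_fn){ .code = adder_code, .env = env };
}

/* A higher-order function that accepts the wide pointer by value. */
static int apply_twice(struct int_fn f, int x)
{
    return call_int_fn(f, call_int_fn(f, x));
}
```

The cost of this design is that `struct int_fn` is not interchangeable with an ordinary `int (*)(int)`, so existing APIs that accept thin function pointers cannot take it directly. That incompatibility is what trampolines work around.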

Thread locals and dynamic scoping–like tricks

  • Thread-local variables are suggested as a way to smuggle extra state into callbacks (e.g., wrapping qsort).
  • Counterexamples show this breaks when closures are stored and invoked later, or when nested/recursive use arises; it becomes effectively dynamic scoping with reentrancy hazards.
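
A minimal sketch of the thread-local trick, assuming a C11 compiler with `_Thread_local` support. `sort_with_context`, `cmp_ctx_fn`, and `by_distance` are hypothetical names, not part of any library. The wrapper saves and restores the previous values so that nested sorts on the same thread keep working, but the state is still only valid while the wrapper is on the call stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical comparator type that takes an extra context argument. */
typedef int (*cmp_ctx_fn)(const void *a, const void *b, void *ctx);

/* The smuggled state: one slot per thread. */
static _Thread_local cmp_ctx_fn current_cmp;
static _Thread_local void *current_ctx;

/* Thin comparator with the signature qsort expects; reads the thread-local slot. */
static int cmp_thunk(const void *a, const void *b)
{
    return current_cmp(a, b, current_ctx);
}

static void sort_with_context(void *base, size_t count, size_t size,
                              cmp_ctx_fn cmp, void *ctx)
{
    assert(cmp != NULL);
    /* Save the outer values so a sort started from inside a comparator
       does not clobber the state of the sort that called it. */
    cmp_ctx_fn saved_cmp = current_cmp;
    void *saved_ctx = current_ctx;

    current_cmp = cmp;
    current_ctx = ctx;
    qsort(base, count, size, cmp_thunk);

    current_cmp = saved_cmp;
    current_ctx = saved_ctx;
}

/* Example comparator: order ints by distance from *ctx. */
static int by_distance(const void *a, const void *b, void *ctx)
{
    int origin = *(const int *)ctx;
    int da = abs(*(const int *)a - origin);
    int db = abs(*(const int *)b - origin);
    return (da > db) - (da < db);
}
```

This works because qsort calls the comparator synchronously on the calling thread. If `cmp_thunk` were stored and invoked after `sort_with_context` returned, it would read whatever the slot held at that moment, which is the dynamic-scoping failure the counterexamples describe.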

Interoperability and other languages

  • Rust, C++, Blocks, Borland-style __closure, and Raku’s state variables are cited as prior art.
  • Rust’s capturing closures need trampolines or fat-pointer patterns to interoperate with thin C function pointers; some dislike the unsafe code or extra indirection this requires.
  • Some want C to standardize first-class “wide function pointers” and closure support; others argue closures belong in higher-level languages.

C philosophy and complexity

  • There is a split between those who see closures as making common patterns safer and those who fear they erode C’s minimal, explicit model.
  • Skeptics emphasize that many programmers already struggle with C’s existing rules, and they worry that new implicit mechanisms (closures, defer, etc.) will push C toward C++-like complexity.