2025-03-13

Recursion kills: The story behind CVE-2024-8176 in libexpat

Scope of the vulnerability

Thread centers on CVE-2024-8176 in libexpat: a recursion-based stack overflow triggered by maliciously nested XML entities, leading primarily to denial of service.
Several commenters stress that this is about stack exhaustion, not general “running out of memory”, and that it’s particularly relevant when parsing untrusted input.

“Recursion kills” – is recursion the problem?

Many argue the article’s “recursion kills” framing is overly broad or alarmist; they equate it to blaming malloc for heap bugs.
Consensus among these is that the real issue is unbounded resource usage, not recursion itself: recursion is fine when inputs or depth are bounded or well-controlled.
A minority take a hard line: “nobody should use recursion in production”, especially in C, due to small default stacks and cross‑platform variability.
Others counter that some problems (trees, grammars) are naturally recursive and much clearer when expressed recursively; banning recursion pushes complexity elsewhere.

Stack vs heap, tail calls, and transformations

Repeated clarification that stack is just memory; exceeding stack is one form of “running out of space”.
Tail-call optimization can turn certain recursive patterns into loops that use constant stack; functional languages often lean on this.
Debate over whether “any” recursion can be made tail-recursive: some point to CPS and continuations; others note that non‑trivial algorithms still need somewhere to store state, so you only trade stack space for heap space.
Several note that heap exhaustion is generally less directly exploitable than stack overflow, but can still cause DoS.

Mitigating recursion-based DoS in parsers

Proposed fixes: track recursion depth with a context parameter and fail when exceeding a limit; or use explicit, heap-backed stacks instead of the call stack.
Objections: maximum safe depth varies by platform; any fixed limit creates false positives and arbitrary restrictions on “legal but deeply nested” documents.
Some point out that in complex, indirectly recursive code, threading a depth parameter everywhere quickly becomes messy.
Others argue that robust parsers for hostile input should have multiple configurable limits (nesting depth, payload size, CPU time) and often disable “fancy” XML features.

Tooling, languages, and runtimes

clang-tidy’s misc-no-recursion check is mentioned as useful but not sound; recursion through function pointers is undecidable in general.
Discussion of segmented/growable stacks, guard pages, and compiler protections (e.g., Stack Clash mitigations): these can turn some stack overflows into clean process termination, but that’s still a DoS in many contexts.
Embedded and safety‑critical systems often forbid recursion outright and require static stack bounds; others accept recursion but rely on OS limits and process isolation.

Ecosystem & maintenance concerns

Commenters highlight a “tragedy of the commons” / free-rider dynamic: many companies depend on libexpat but very few contribute money or engineering help.
Some praise the maintainer’s openness in asking for help and documenting the fix; others link related research on input-driven recursion bugs found via CodeQL.

Related topics