How GCC and Clang handle statically known undefined behaviour
Nature of Undefined Behavior (UB)
- UB is framed as a “promise” from programmer to compiler: certain situations won’t occur, enabling aggressive optimization.
- Some argue most real-world C/C++ code inevitably contains UB; the real issue is how compilers exploit it, not its mere existence.
- Others insist developers must treat any discovered UB as a bug to be removed, not something to “work around” in the optimizer.
Compiler Optimizations and “Time Travel”
- Many examples show optimizers deleting sanity checks or moving crashes earlier once UB is provable (e.g., divide-by-zero, null-pointer dereference).
- Disagreement over whether UB on any path invalidates the whole program:
  - One view: only the actually executed path is affected.
  - Another (especially for C++): UB can “time travel” within that path, pulling crashes before prior side effects.
- For C, a clarification in C23 explicitly says earlier observable behavior must remain as specified, even if UB occurs later.
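The patterns discussed in this part of the thread can be sketched in C. This is a minimal illustration and not code from the discussion; the function names (`read_then_check`, `check_then_read`, `log_then_divide`, `demo_checks`) are hypothetical, and whether a given compiler actually removes the check or reorders the crash depends on the compiler, version, and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Risky ordering: *p is read before the check. Dereferencing a null
   pointer is UB, so an optimizer may assume p != NULL from the read
   and treat the check below as dead code. */
int read_then_check(const int *p)
{
    int v = *p;
    if (p == NULL)      /* candidate for removal by the optimizer */
        return -1;
    return v;
}

/* Safer ordering: check first, dereference only on the non-null path. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}

/* "Time travel" candidate: the message is an observable side effect that
   precedes a division which is UB when d == 0. The disagreement is about
   whether an implementation may behave as if the failure happened before
   the message was written. */
int log_then_divide(int n, int d)
{
    puts("about to divide");
    return n / d;
}

/* Usage with well-defined inputs only; no UB is triggered here. */
void demo_checks(void)
{
    int x = 7;
    assert(read_then_check(&x) == 7);
    assert(check_then_read(&x) == 7);
    assert(check_then_read(NULL) == -1);
    assert(log_then_divide(10, 2) == 5);
}
```

Calling `read_then_check(NULL)` or `log_then_divide(1, 0)` is exactly the kind of call the thread argues about, so the demo deliberately avoids both.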
Diagnostics, Sanitizers, and Static UB
- UBSan and related tools help but are incomplete, can be costly at run time, and are generally not intended for production use.
- Some want compilers to treat statically known UB as hard errors rather than “unreachable”, to avoid miscompilations.
- Others note practical difficulties: UB is often only visible after deep IR optimizations; mapping that back to source with useful errors is hard.
Language Standards and Committee Views
- There is active work in C’s UB study group and new wording in C23 that forbids “retroactive” effects of UB.
- Strong debate over whether standards bodies or compiler vendors are to blame for UB being used as a license for extreme optimizations.
Systems / Embedded and Low-Level Programming
- Several commenters complain that the UB model of modern ISO C/C++ makes low-level tasks (JITs, direct hardware access, stack walking) non-portable or fragile.
- Counterpoint: much of this behavior is implementation-defined rather than strictly UB, and can be supported by specific compilers/ABIs.
Signed Integer Overflow and Performance
- Signed overflow as UB enables optimizations and overflow-trapping modes; proponents find this valuable.
- Critics argue it breaks intuitive arithmetic, hurts safety, and that performance wins are minor compared to correctness.
- There is discussion of using unsigned or library types to get defined wrapping/trapping semantics, albeit with worse ergonomics.
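A rough sketch of the alternatives mentioned above, using only standard C: unsigned arithmetic for defined wrapping, and a hand-written pre-check for overflow detection on signed values. The helper names (`wrapping_add`, `checked_add`, `demo_overflow`) are hypothetical and chosen for illustration; they are not from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1,
   so this addition has no undefined behavior. */
unsigned wrapping_add(unsigned a, unsigned b)
{
    return a + b;
}

/* Checked signed add: the operands are tested first, so the addition
   is only evaluated when the result fits in an int.
   Returns false (and leaves *out untouched) on overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Usage: every call below has fully defined behavior. */
void demo_overflow(void)
{
    int sum = 0;
    assert(wrapping_add(UINT_MAX, 1u) == 0u);
    assert(checked_add(2, 3, &sum) && sum == 5);
    assert(!checked_add(INT_MAX, 1, &sum));
}
```

The extra out-parameter and boolean result show the ergonomic cost compared with writing `a + b`. GCC and Clang also provide `__builtin_add_overflow`, and C23 adds `ckd_add` in `<stdckdint.h>`, which serve a similar purpose where available.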
Proposed Directions
- Treat static UB as compile-time errors.
- Reduce “00UB” (UB from simple integer ops) and add clearer tiers (e.g., defined, implementation-defined, trapped).
- Improve diagnostics by reusing information the optimizer already computes, without making the diagnostics depend on which optimization flags are enabled.