2026-05-20

No way to parse integers in C (2022)

State of the C standard library

Many commenters see C’s stdlib, especially string and number functions, as fundamentally unsafe or poorly designed (lack of bounds checking, locale issues, ambiguous errors).
Others argue the library is weak but acceptable if wrapped; “C is not its standard library,” and serious C projects often build their own safer utility layers.
There’s regret that C never got a widely adopted “Boost-like” common library or a single dominant package manager, leading to every shop reinventing utilities.

Integer parsing pitfalls in C

Built-ins like atoi/atol, strtol/strtoul/strtoull, and sscanf are criticized for:
- Silent truncation / overflow, or using max values (e.g., ULONG_MAX) as sentinels.
- Accepting negative input for unsigned parses and wrapping instead of erroring.
- Stopping at first invalid character and returning a partial value (e.g. "123timmy").
- Legacy behaviors such as treating a leading 0 as octal (strtol-family functions with base 0, or sscanf with %i).
A concrete example shows strtoull on large negative inputs yielding small positive numbers by wraparound, with no error reported, which many consider simply “the wrong answer.”
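
The pitfalls above can be reproduced with a small probe. This is a sketch only: `naive_parse` and `struct naive_result` are hypothetical names, not taken from the thread, and the inputs in the usage note are illustrative and assume a 64-bit `unsigned long long` in the default "C" locale.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical probe: records what strtoull returned and how much it read. */
struct naive_result {
    unsigned long long value; /* value returned by strtoull */
    size_t consumed;          /* number of input bytes it accepted */
    int range_error;          /* nonzero if errno was set to ERANGE */
};

struct naive_result naive_parse(const char *s, int base)
{
    struct naive_result r;
    char *end;

    assert(s != NULL);
    errno = 0; /* strtoull never clears errno, so clear it first */
    r.value = strtoull(s, &end, base);
    r.consumed = (size_t)(end - s);
    r.range_error = (errno == ERANGE);
    return r;
}
```

With this probe, `naive_parse("-1", 10)` reports `ULLONG_MAX` with no range error, `naive_parse("-18446744073709551615", 10)` reports 1, `naive_parse("123timmy", 10)` reports 123 after consuming 3 bytes, and `naive_parse("010", 0)` reports 8. Only a value whose magnitude is out of range, such as `"99999999999999999999999"`, sets ERANGE, and its return value is the same `ULLONG_MAX` that `"-1"` produces.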

Workarounds and alternatives

Common patterns: write your own parser, wrap the stdlib functions, or use return-code-plus-output-parameter APIs; errors are reported via errno, negative return codes, or abort.
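
As one possible shape for the "wrap stdlib, return a code, write through an output parameter" pattern, here is a minimal sketch. The name `parse_long_strict` and the `parse_status` codes are hypothetical, and the policy (base 10 only, no leading whitespace, no `+` sign, no trailing characters) is an assumption rather than something the thread agreed on.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes: zero on success, negative on failure. */
enum parse_status { PARSE_OK = 0, PARSE_INVALID = -1, PARSE_RANGE = -2 };

/* Hypothetical strict wrapper: the whole string must be a base-10 long. */
int parse_long_strict(const char *s, long *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);

    /* strtol skips leading whitespace and accepts '+'; reject both here
       by requiring a digit, or '-' followed by a digit, at the start. */
    if (!(isdigit((unsigned char)s[0]) ||
          (s[0] == '-' && isdigit((unsigned char)s[1]))))
        return PARSE_INVALID;

    errno = 0; /* strtol only sets errno on failure, so clear it first */
    v = strtol(s, &end, 10);
    if (*end != '\0')
        return PARSE_INVALID; /* trailing garbage such as "123timmy" */
    if (errno == ERANGE)
        return PARSE_RANGE;   /* clamped to LONG_MIN or LONG_MAX */

    *out = v; /* written only on success */
    return PARSE_OK;
}
```

A caller would check the return code before touching the output, for example `if (parse_long_strict(arg, &n) != PARSE_OK) { /* reject input */ }`. Fixing the base at 10 also sidesteps the leading-0 octal surprise.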
Some propose pre-validating with regex or string comparison round-trips, though that’s seen as ugly or inefficient.
OpenBSD’s strtonum is noted as better but limited (whitespace handling, only signed long long).
Example custom parsers are shared; even those have subtle undefined-behavior (UB) bugs pointed out (e.g., negating INT64_MIN, which overflows).

Language design, UB, and portability

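Strong criticism of undefined behavior (UB): because compilers may assume UB never happens, they can legally drop checks (e.g., null checks after a dereference, signed-overflow checks written after the fact), leading to surprising crashes.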
Debate over C’s “portable assembly” role; some argue its flexible integer sizes undermine true portability, others say efficiency justified the design historically.
One view: standard functions are lexeme scanners optimized for unbounded Unix text streams, not full validators; proper parsing should be a layer above.

Teaching and philosophy

Anecdotes: courses assigning “parse integers correctly” as a semester-long exercise to expose edge cases.
Split perspectives: some say the article nitpicks edge cases; others insist correct, unambiguous parsing is a baseline requirement, not perfectionism.
