2026-05-02

Unsigned sizes: A five year mistake

Signed vs unsigned for sizes and indices

Many argue sizes/indices should be signed: subtraction is common, and negative results should be representable or at least clearly erroneous.
With unsigned indices, underflow silently wraps, making bugs hard to spot (e.g., reverse loops, “index before this one”).
Others strongly prefer unsigned for sizes: sizes are inherently non-negative, and using a signed type wastes half the range and can limit addressable space or force larger types.
Some see signed vs unsigned as less important than having good bounds checks and clear overflow semantics.
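The reverse-loop and "index before this one" hazards can be sketched in Rust, chosen here only because it makes the wrap explicit. The function names are hypothetical and not taken from the discussion. By default, a plain `i - 1` on `usize` panics in a debug build and wraps in a release build, so the sketch uses methods that state the intended behavior.

```rust
/// Returns the index before `i`, or `None` when `i` is already 0.
fn previous_index(i: usize) -> Option<usize> {
    i.checked_sub(1)
}

/// The wrap the discussion warns about, spelled out explicitly:
/// the "index before 0" becomes the largest representable index.
fn wrapped_previous_index(i: usize) -> usize {
    i.wrapping_sub(1)
}

/// Collects the elements of `items` from last to first without ever
/// computing `len - 1` or decrementing an unsigned counter past zero.
fn reversed<T: Copy>(items: &[T]) -> Vec<T> {
    let mut out = Vec::with_capacity(items.len());
    for i in (0..items.len()).rev() {
        out.push(items[i]);
    }
    out
}
```

Iterating over a reversed range sidesteps the classic `for (i = n - 1; i >= 0; i--)` bug, where the condition is always true for an unsigned `i`.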

Unsigned semantics: modular vs “non‑negative”

Several comments stress that “unsigned” in C-like languages means modular arithmetic (values are residues mod 2ⁿ), not “cannot be negative”.
This mismatch between intuition (“non-negative”) and reality (wraparound) is cited as a core footgun.
Some suggest languages should add true non‑negative integer types distinct from modular/bitfield types.
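A minimal Rust sketch of the two readings, with hypothetical function names: `modular_sub` behaves like C-style unsigned subtraction, `modular_sub_reference` reproduces the same value with wide signed math and an explicit reduction mod 2³², and `non_negative_sub` approximates what a "true non-negative" type might do by refusing to produce a wrapped result.

```rust
/// Subtracts in the ring of integers mod 2^32, which is what C-style
/// `unsigned` arithmetic does implicitly.
fn modular_sub(a: u32, b: u32) -> u32 {
    a.wrapping_sub(b)
}

/// The same result computed with wide signed math and an explicit
/// reduction mod 2^32, to show that the wrapped value is a residue.
fn modular_sub_reference(a: u32, b: u32) -> u32 {
    const MODULUS: i64 = 1 << 32;
    let diff = i64::from(a) - i64::from(b);
    // rem_euclid always yields a value in 0..MODULUS, so the cast is lossless.
    diff.rem_euclid(MODULUS) as u32
}

/// A "non-negative" reading of the same operation: the result either
/// fits in the non-negative range or is reported as absent.
fn non_negative_sub(a: u32, b: u32) -> Option<u32> {
    a.checked_sub(b)
}
```

Under the modular reading, `3 - 5` is the residue 4294967294; under the non-negative reading it has no answer at all.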

Overflow, undefined behavior, and diagnostics

Signed overflow being undefined in C/C++ is seen as both a feature (sanitizers/traps can catch bugs) and a hazard (optimizers can delete or mangle code).
Unsigned wraparound is well-defined but therefore harder to detect as an error automatically.
Some rely on sanitizers and strict warnings (-Wsign-conversion, traps on overflow) to make signed arithmetic safer than unsigned.

Language design comparisons

C/C++: criticized for dangerous implicit conversions and unsigned defaults for sizes; defended as pragmatic and close to hardware.
Rust: uses unsigned for sizes but forces explicit casts and has safe wrappers (checked_*, wrapping_*, saturating_*), reducing silent bugs.
Go: len returns a signed int; integer overflow wraps with defined behavior; bounds checks apply regardless of index type, making signed vs unsigned largely a non-issue.
Zig: distinguishes wrapping vs non-wrapping operations and enforces explicitness on modulo/overflow behavior.
Java: mostly signed primitives; unsigned operations are exposed through library methods; some miss native unsigned types for bit-level work.
Pascal/Ada: cited as examples with range types and true non-negative integers.
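To make the Rust entry concrete, here is a small sketch using standard-library methods. The helper names `add_three_ways` and `offset_to_index` are made up for illustration. It shows one addition under each of the three method families, plus an explicit signed-to-unsigned conversion that can fail.

```rust
use std::convert::TryFrom;

/// Adds two byte-sized counters three ways; each method name states
/// what should happen when the true sum does not fit in a `u8`.
fn add_three_ways(a: u8, b: u8) -> (Option<u8>, u8, u8) {
    (a.checked_add(b), a.wrapping_add(b), a.saturating_add(b))
}

/// Converts a signed offset to an index. There is no implicit
/// signed-to-unsigned conversion, so a negative value must be handled.
fn offset_to_index(offset: i64) -> Option<usize> {
    usize::try_from(offset).ok()
}
```

For `200 + 100` in a `u8`, the checked form reports failure, the wrapping form yields 44, and the saturating form yields 255, so the choice of semantics is visible at the call site.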

Use cases favoring unsigned/modular types

Low-level and performance-sensitive domains (HPC, graphics, simulation, bioinformatics, compression, succinct data structures) use unsigned to:
- Exploit full bit-width for indices and counters.
- Map cleanly onto bit patterns and modular algebra.
- Implement ring buffers, sequence numbers, and bit packing.
Counterpoint: data-structure sizes and general program logic are argued to rarely need the full unsigned range, and a value above the signed maximum is more often the result of a logic error than a legitimate size.
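The ring-buffer case is one place where wraparound is the intended behavior. Below is a sketch in Rust under stated assumptions: the `Ring` type, its `new_at` constructor, and the capacity of 8 are all hypothetical. The two counters run freely and are allowed to wrap past `u32::MAX`, and occupancy is their modular difference.

```rust
/// Capacity must be a power of two so that `counter % CAPACITY`
/// stays consistent when the counters wrap past `u32::MAX`.
const CAPACITY: u32 = 8;

struct Ring {
    slots: [u8; CAPACITY as usize],
    head: u32, // total number of pushes, mod 2^32
    tail: u32, // total number of pops, mod 2^32
}

impl Ring {
    /// Starts both counters at `start`, which lets a test begin near the wrap.
    fn new_at(start: u32) -> Self {
        Ring { slots: [0; CAPACITY as usize], head: start, tail: start }
    }

    /// Occupancy is the modular difference of the two counters.
    fn len(&self) -> u32 {
        self.head.wrapping_sub(self.tail)
    }

    fn push(&mut self, value: u8) -> bool {
        if self.len() == CAPACITY {
            return false; // full
        }
        self.slots[(self.head % CAPACITY) as usize] = value;
        self.head = self.head.wrapping_add(1);
        true
    }

    fn pop(&mut self) -> Option<u8> {
        if self.len() == 0 {
            return None; // empty
        }
        let value = self.slots[(self.tail % CAPACITY) as usize];
        self.tail = self.tail.wrapping_add(1);
        Some(value)
    }
}
```

Starting the counters just below `u32::MAX` exercises the wrap: the buffer keeps its FIFO order while `head` rolls over to 0.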

Higher-level abstractions for indexing

Some suggest indices should be treated as opaque handles or custom types, not raw integers.
Proposals include per-array index types, opaque structs that callers cannot do math on, or iterator-based patterns to avoid numeric indexing altogether.
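One way the opaque-handle idea might look in Rust, as a sketch rather than any commenter's actual proposal; `NameId` and `Names` are hypothetical names. The handle wraps an index in a private field and implements no arithmetic traits, so the only way to obtain one is from the container that owns the data.

```rust
/// Opaque handle into a `Names` table. The field is private, so code
/// outside this module cannot construct a handle or do arithmetic on it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NameId(usize);

#[derive(Default)]
pub struct Names {
    items: Vec<String>,
}

impl Names {
    /// Stores `name` and returns the only kind of value that can index it.
    pub fn insert(&mut self, name: &str) -> NameId {
        let id = NameId(self.items.len());
        self.items.push(name.to_string());
        id
    }

    /// Lookup by handle. `NameId` has no `Add` or `Sub` impl, so
    /// `id - 1` is a compile error rather than a silent wrap.
    pub fn get(&self, id: NameId) -> Option<&str> {
        self.items.get(id.0).map(|s| s.as_str())
    }
}
```

With this shape, the signed-versus-unsigned question is confined to the container's internals, since callers never see the underlying integer.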

Debate over C’s historical intent and consistency

There is disagreement over whether C’s unsigned was always specified as modular and whether later standards introduced inconsistencies with sizeof and conversions.
One side claims the standard is internally inconsistent; another cites early documentation that explicitly defines modulo-2ⁿ behavior.

Miscellaneous

A few comments criticize the blog’s light-grey-on-white typography; others note it becomes darker when JavaScript runs.
