C++ proposal: There are exactly 8 bits in a byte
Motivation and Practical Reality
- Many commenters note that virtually all modern systems already use 8‑bit bytes, and large amounts of C/C++ code implicitly assume this.
- Several argue that formalizing `CHAR_BIT == 8` simply matches de‑facto reality and improves developer expectations and portability in practice.
- Others point out they only recently learned that a byte was not guaranteed to be 8 bits, and see this as a surprising and confusing corner of the standard.
Legacy and Exotic Architectures
- Multiple examples of non‑8‑bit “bytes” are discussed: 6‑, 7‑, 9‑, 10‑, 12‑, 16‑, 24‑, 32‑, and 36‑bit addressed units (UNIVAC, PDP‑10, various DSPs, historical mainframes, retro consoles).
- Some of these systems still exist as emulated mainframes or niche DSPs, but often don’t track modern C++ standards or run mostly non‑C++ code.
- A few participants enjoy hardware diversity and are sad to see standards drop support for such machines, even if they’re niche.
DSPs and Word‑Addressed Systems
- DSP chips with 16‑ or 32‑bit addressable units are repeatedly cited as the only currently plausible targets that conflict with `CHAR_BIT == 8`.
- Opinions diverge: some say these platforms are specialized enough that non‑standard or frozen C/C++ dialects are acceptable; others insist they remain “real” systems that will lose conformance.
Portability vs. Simplification
- Supporters frame the change as analogous to mandating two’s complement integers: it reflects how “every real computer works now” and reduces undefined edge cases.
- Critics worry about excluding legitimate architectures and question the concrete benefit, since `CHAR_BIT` would still exist and much code already assumes 8.
- Some see this as part of C++’s tension between maximal portability and acknowledging the practical hardware baseline.
Terminology and Related Debates
- Several comments clarify: in C/C++ a “byte” is the size of `char`; “octet” is the unambiguously 8‑bit term, especially in networking.
- There is side discussion about the fuzziness and obsolescence of “word” size, and about whether higher‑level code should think in bits, bytes, or fixed‑width integer types.