Why are QR Codes with capital letters smaller than QR codes with lower case?
Core technical explanation
- QR has multiple modes: numeric, alphanumeric, byte, kanji.
- Alphanumeric mode covers a fixed 45-character set (digits 0-9, uppercase A-Z, space, and the symbols $ % * + - . / :), encoding 2 chars in 11 bits (~5.5 bits/char).
- Lowercase letters are not in this set, so any lowercase forces byte mode (8 bits/char), producing larger codes.
- QR messages can be split into segments with different modes, but many generators don’t optimize this.
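The size difference described above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a small symbol (QR versions 1 to 9, where the character-count field is 9 bits for alphanumeric and 8 bits for byte mode), assumes byte mode carries UTF-8, and ignores the terminator, padding and error correction. The function names are made up for this example.

```python
# 45-character QR alphanumeric set: digits, uppercase letters, space, and 8 symbols.
ALNUM = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def alnum_bits(text: str) -> int:
    """Bits for one alphanumeric segment (assumes QR versions 1-9)."""
    if any(c not in ALNUM for c in text):
        raise ValueError("text contains characters outside the alphanumeric set")
    n = len(text)
    # 4-bit mode indicator + 9-bit count, 11 bits per pair, 6 bits for a leftover char.
    return 4 + 9 + 11 * (n // 2) + 6 * (n % 2)

def byte_bits(text: str) -> int:
    """Bits for one byte-mode segment (assumes QR versions 1-9, UTF-8 payload)."""
    # 4-bit mode indicator + 8-bit count, 8 bits per byte.
    return 4 + 8 + 8 * len(text.encode("utf-8"))

if __name__ == "__main__":
    print(alnum_bits("HTTPS://EXAMPLE.COM/ABC"))  # uppercase URL, alphanumeric mode
    print(byte_bits("https://example.com/abc"))   # same URL in lowercase, byte mode
```

For this 23-character URL, the uppercase form needs 140 bits and the lowercase form 196 bits before error correction, which is the gap that can push a code into a larger version.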
Alternative encodings & efficiency debates
- Data Matrix and some 1D codes (e.g., Code 128) can shift modes inline; QR instead uses explicit segments.
- Some suggest using general compression (Huffman/entropy coding) instead of fixed modes, but others point out you then need shared probability tables, which becomes equivalent to predefined modes.
- Base45 (RFC 9285) is discussed for packing binary data into QR alphanumeric; it has small overhead vs pure byte mode and avoids big‑integer math.
- Others argue numeric or carefully chosen alphabets (via “base‑x”) can be more efficient or simpler.
- There’s a detailed back-and-forth about how to measure “efficiency” (fraction of bit space used vs information-theoretic bits per symbol) and whether base45 or QR alphanumeric is more efficient under realistic constraints.
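To make the Base45 point concrete, here is a minimal encoder sketch following the scheme in RFC 9285. Each pair of bytes becomes three characters from the QR alphanumeric set, least significant digit first, and a trailing single byte becomes two characters. The function name is chosen for this example, and decoding is left out.

```python
# Base45 alphabet from RFC 9285; identical to the QR alphanumeric character set.
B45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def base45_encode(data: bytes) -> str:
    """Encode bytes as Base45 text (sketch of the RFC 9285 scheme)."""
    out = []
    # Each full pair of bytes is a 16-bit number written as three base-45 digits.
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]
        n, c = divmod(n, 45)   # c: least significant digit
        e, d = divmod(n, 45)   # d: middle digit, e: most significant digit
        out += [B45[c], B45[d], B45[e]]
    # A leftover single byte is written as two base-45 digits.
    if len(data) % 2:
        d, c = divmod(data[-1], 45)
        out += [B45[c], B45[d]]
    return "".join(out)

if __name__ == "__main__":
    print(base45_encode(b"AB"))
    print(base45_encode(b"ietf!"))
```

In alphanumeric mode, three characters cost 16.5 bits (11 for a pair plus 5.5 on average for the third), so 16 bits of payload occupy 16.5 bits of code space. That is roughly 3% overhead relative to byte mode, and only small integer arithmetic is needed.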
Tools and visual explainers
- Several links to step‑by‑step and visual QR explainers are shared, with praise for ones that let you input your own data and see every encoding step.
- A video of manually constructing a QR code on a Go board is mentioned as a nice illustration.
Uppercase URLs & compatibility
- Question: is it safe to uppercase the URL scheme (“HTTPS”)?
- Various people report practical success on major phones, though some iOS quirks appear when the scheme is omitted and a non-.com TLD is used.
- Common strategy: keep “https://” lowercase, uppercase the domain (and sometimes path), and rely on mode segmentation for size benefits.
- Standards say schemes are case-insensitive but canonically lowercase; real-world scanners generally accept uppercase.
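The mixed-case strategy depends on the generator splitting the URL into segments. The sketch below estimates the cost of a deliberately naive splitter that starts a new segment whenever the text switches between alphanumeric-capable characters and everything else. It is an assumption-laden illustration, not how any particular generator works: it assumes QR versions 1 to 9 (9-bit and 8-bit count fields) and a UTF-8 byte payload, and the helper names are hypothetical.

```python
from itertools import groupby

ALNUM = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

def segment_bits(is_alnum: bool, run: str) -> int:
    """Bits for one segment, header included (assumes QR versions 1-9)."""
    if is_alnum:
        n = len(run)
        return 4 + 9 + 11 * (n // 2) + 6 * (n % 2)
    return 4 + 8 + 8 * len(run.encode("utf-8"))

def naive_split_bits(text: str) -> int:
    """Total bits when every maximal run of same-mode characters is its own segment."""
    total = 0
    for is_alnum, chars in groupby(text, key=lambda c: c in ALNUM):
        total += segment_bits(is_alnum, "".join(chars))
    return total

if __name__ == "__main__":
    print(naive_split_bits("https://EXAMPLE.COM/ABC"))  # lowercase scheme, uppercase rest
    print(naive_split_bits("https://example.com/abc"))  # fully lowercase
```

Under these assumptions the mixed-case URL costs 164 bits, against 196 bits for a single byte-mode segment. The fully lowercase URL shows why segmentation needs care: splitting at every punctuation mark costs 260 bits, because each segment pays a 12- or 13-bit header, so a good encoder merges short runs rather than splitting greedily.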
QR bloat, tracking, and usability
- Many real-world QR codes are far larger than necessary because they embed long tracking URLs and query strings; this makes them harder to scan but easier to implement and track.
- Some argue a short domain + simple identifier (possibly uppercase-only) should suffice, with redirects handling complexity behind the scenes.
Human vs machine-readable concerns
- Several commenters dislike the post‑pandemic trend of QR‑only menus and ordering: dependence on phones, poor accessibility, fragile scanning, and hidden tracking.
- Preference for also printing short, human-readable URLs or recognizable names, akin to text under 1D barcodes.
- Discussion of using pixel fonts or OCR-friendly fonts (OCR-A/OCR-B); others note that modern OCR copes with ordinary fonts and that barcodes still win for robustness and structured data.
Historical/contextual note
- One commenter notes QR was invented in Japan, where Latin letters—when used—are typically uppercase only, so an uppercase+numeric “alphanumeric mode” aligns with local conventions and helps explain the design choice.