Whence '\n'?
Escape sequence \n and compiler bootstrapping
- Central point: where does the mapping
'\n' → byte 10actually live when a compiler is self-hosting and its source only ever says'\n', not10? - Discussion emphasizes that in Rust’s current sources, this mapping is not explicitly numeric; the knowledge is “inherited” transitively from earlier compilers, echoing classic “trusting trust” concerns.
- Some argue this shows the compiler is not rebuildable “from scratch” purely from its current sources; it depends on implicit knowledge embedded in an ancestor binary.
- Others stress that physically there has always been a
0x0Asomewhere in the binary;'\n'is just a human-facing alias.
Related prior work and inspirations
- Several commenters link similar explorations: self-hosting C compiler diaries, classic lectures on compilers inserting hidden behavior, and prior blog posts about escape sequences.
- There’s interest in how different compilers (GCC, Clang) do it; code excerpts show they hardcode numeric values for escapes, with ASCII/EBCDIC branches.
Character escapes in other languages
- OCaml’s decimal
\010-style escapes are discussed; some find them more “primitive” than symbolic escapes like'\n', others note they still rely on numeric parsing that itself must be grounded somewhere. - Backslash-decimal escapes are noted as rare but present in OCaml, Lua, DNS.
- Python’s
\N{UNICODE NAME}is mentioned as a more self-documenting mechanism. - CSV-style
\Nas NULL and Python’s\Nlead to some confusion over the HN title.
ASCII, control codes, and encodings
- Several comments review ASCII control codes, caret notation (
^C,^M), and how control characters are mapped by terminals. - EBCDIC is raised as a complicating case: early C existed on non-ASCII systems, with different newline codes and even separate NEL vs LF.
- There’s debate over whether the article misses this wider encoding context.
Meta reactions, alternatives, and tangents
- Some readers find the article poetic and mind-expanding about compilers as inputs; others dismiss it as a “nothingburger.”
- Speculation on alternatives if ASCII had no escapes: constants like
PHP_EOL, font-level visible control glyphs, or APIs for cursor movement. - Extended humorous tangent on the “USB rule” and connector confusion.
- Brief note that some newer languages deliberately postpone self-hosting, possibly to avoid such bootstrapping subtleties.