The Journey Before main()
Symbol tables and binary size
- Comparison of `readelf` output shows a statically linked musl “hello world” with thousands of symbols vs ~36 when linked to glibc dynamically.
- Commenters attribute this to static linking and musl’s design, not just RISC‑V or build flags.
Avoiding the C standard library / direct syscalls
- Some enjoy writing C programs that bypass libc and call Linux syscalls directly, or use “nolibc” headers.
- Others argue this is fun but impractical: loses portability, requires re‑implementing basics (string/number conversion, allocators), and ties you to kernel ABIs.
- Several note that on non‑Linux systems (BSDs, Windows) syscall ABIs are not stable, so libc is required.
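The portability cost raised above can be made concrete with a minimal sketch. It invokes getpid by number instead of by name, and the number differs between architectures. Assumptions: the helper name `raw_getpid` and the small lookup table are illustrative, and Python cannot issue the syscall instruction itself, so the call still goes through libc's generic `syscall()` wrapper via ctypes.

```python
import ctypes
import os
import platform

# getpid syscall numbers per architecture, taken from the kernel's syscall
# tables. Architectures not listed are deliberately left out of this sketch.
SYS_GETPID = {"x86_64": 39, "aarch64": 172, "riscv64": 172}


def raw_getpid():
    """Invoke getpid by number; return None where this sketch does not apply."""
    if platform.system() != "Linux":
        return None  # syscall numbers are not a stable interface elsewhere
    if ctypes.sizeof(ctypes.c_void_p) != 8:
        return None  # 32-bit userspace uses a different syscall table
    number = SYS_GETPID.get(platform.machine())
    if number is None:
        return None  # unknown architecture: no number to use
    libc = ctypes.CDLL(None, use_errno=True)  # symbols of the running process
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(number))
```

On a supported Linux machine `raw_getpid()` should agree with `os.getpid()`. Everywhere else the function gives up, which is the practical shape of the "ties you to kernel ABIs" objection.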
Windows vs Linux APIs and linking models
- On Windows, typical apps call Win32 APIs (Kernel32, User32, GDI32, etc.), not raw syscalls; ntdll and Win32U wrap the actual syscalls, whose numbers change across versions.
- Discussion of CRT‑free Win32: you can avoid the C runtime but still rely on system DLLs; some show hacks that call into kernel32 without an import table, but these are unsupported and AV‑unfriendly.
- Debate over whether “Windows support is a requirement”: some insist serious (“adult”) projects should plan for it; others say many server and embedded systems will never run on Windows, so Linux‑only is fine.
- Long comparison of linking models:
  - Windows: import libraries, DLLs treated more as black boxes, fewer global shared libs, more stable system DLL ABIs.
  - Linux/GNU: direct linking to `.so` files, global library paths, versioned glibc symbols; praised for flexibility, criticized as “ABI/dependency hell” that motivated Docker and per‑app bundling.
Glibc, ABI stability, and libc alternatives
- Complaints that glibc changes can break binaries (example: Steam/games impacted by ELF/exec‑on‑stack changes, later reverted).
- View that on GNU/Linux, glibc effectively is half the OS: the dynamic loader, NSS, DNS behavior, and many facilities are in glibc, not the kernel.
- Some want glibc split conceptually into three parts: syscall wrappers, dynamic loader, and higher‑level C library, to isolate ABI.
- musl + static linking is proposed as a simple, portable option for non‑GUI tools, at the cost of size and some performance.
- Others argue that replacing glibc amounts to building a different OS stack and incurs substantial effort.
ELF loading, dynamic linkers, and loaders
- Clarification that on Linux the kernel only maps PT_LOAD segments and then jumps to the ELF interpreter given by PT_INTERP.
- The user‑space dynamic linker (e.g., glibc’s `ld.so`) performs relocations, loads all shared objects via `mmap`/`mprotect`, handles `dlopen`, auditing, preloads, etc.
- This is compared to shebang handling: the loader is akin to a binary “interpreter”.
- People note how complex loaders really are (dependency graph resolution, init/fini, audit features), which explains why there aren’t many alternative loaders and why loaders are tightly coupled to their libc.
- Discussion that the kernel ignores ELF sections entirely; it only cares about program headers (segments). Embedding extra sections doesn’t make the kernel map them unless PT_LOAD entries are updated.
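The segments‑not‑sections point can be sketched by reading only what the kernel is described as reading: the program header table. This is a rough illustration, not a loader. It assumes a little‑endian ELF64 image, and the function names `program_headers` and `kernel_view` are made up for this example.

```python
import struct

PT_LOAD, PT_INTERP = 1, 3
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
ELF64_PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, 56 bytes


def program_headers(data):
    """Yield (p_type, p_offset, p_vaddr, p_filesz, p_memsz) for each segment."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")
    fields = ELF64_EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        (p_type, _flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = ELF64_PHDR.unpack_from(
            data, e_phoff + i * e_phentsize)
        yield p_type, p_offset, p_vaddr, p_filesz, p_memsz


def kernel_view(data):
    """Return (interpreter path or None, [(vaddr, memsz) of each PT_LOAD])."""
    interp, loads = None, []
    for p_type, p_offset, p_vaddr, p_filesz, p_memsz in program_headers(data):
        if p_type == PT_INTERP:
            # The segment's bytes are a NUL-terminated path to the loader.
            interp = data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
        elif p_type == PT_LOAD:
            loads.append((p_vaddr, p_memsz))
    return interp, loads
```

Feeding it the bytes of a dynamically linked executable should yield the loader path plus a handful of load segments. The section header table (`e_shoff`) is never consulted, which mirrors the claim that extra sections change nothing unless a PT_LOAD entry covers them.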
Shebangs, binfmt, and debugging issues
- A bug story: a Java program got ENOENT while executing an existing script because the script’s shebang interpreter path didn’t exist on the remote host; Java surfaced just “No such file or directory.”
- Advice: use `strace` to see the failing `execve`; note that shebang support itself depends on the `CONFIG_BINFMT_SCRIPT` kernel option.
- Mention of `binfmt_misc` for associating arbitrary magic with interpreters (used for Wine, qemu user‑mode, etc.).
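The ENOENT confusion from the bug story can be checked without running anything. The sketch below is a hypothetical helper, `diagnose_enoent`, that looks at the script and at the interpreter named on its first line. It approximates the kernel's shebang parsing (first whitespace‑separated token after `#!`) and is not a substitute for `strace`.

```python
import os
import tempfile


def diagnose_enoent(script_path):
    """Return a hint about why executing script_path could report ENOENT."""
    if not os.path.exists(script_path):
        return "script itself is missing"
    with open(script_path, "rb") as f:
        first_line = f.readline(256)
    if not first_line.startswith(b"#!"):
        return "no shebang; ENOENT must come from elsewhere"
    parts = first_line[2:].split()
    if not parts:
        return "empty shebang line"
    interpreter = os.fsdecode(parts[0])
    if not os.path.exists(interpreter):
        return f"shebang interpreter {interpreter} is missing"
    return "script and interpreter both exist"


def demo():
    """Create a script whose interpreter (a made-up path) does not exist."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "job.sh")
        with open(script, "w") as f:
            f.write("#!/nonexistent/bin/interp\necho hello\n")
        return diagnose_enoent(script)
```

In the demo the script exists, yet the hint points at the interpreter path. That is the situation the Java program hit: the error names no file, so the caller assumes the script is the missing one.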
Direct syscalls vs system libraries for higher‑level facilities
- Some argue that directly using kernel syscalls is ideal for minimalism and clarity.
- Others counter that newer subsystems (ALSA, DRM, GPU drivers) are intentionally fronted by user‑space libraries; this makes interception, portability, 32‑/64‑bit compatibility, and ABI evolution much easier.
- This is cast as a “Windows‑style” design (rich system libraries over a smaller syscall surface) being preferable for many real programs.
Memory layout and teaching stack/heap
- A university instructor points out that most textbooks draw virtual memory with “higher addresses at the top”, which conflicts with how editors and `/proc/<pid>/maps` present things (addresses increase as you go down).
- They argue that drawing low addresses at the top and high at the bottom matches real tooling and makes the layout easier to see: text and heap at lower addresses, stack near the top of the address space; the heap grows toward higher addresses while the stack pointer moves toward lower ones.
- Others push back, noting long‑standing conventions (“stack grows down”) and other mental models; some prefer horizontal layouts (low on left, high on right).
- There’s also side discussion about little‑endian representation vs how numeric addresses are written, and how that can confuse beginners.
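The instructor's point about tooling is easy to check on a Linux machine. The following sketch parses `/proc/<pid>/maps` in file order, following the line format documented in proc(5). The `Region` type and the function names are assumptions made for this example.

```python
import os
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line):
    """Parse one line: 'start-end perms offset dev inode [path]'."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Region(start, end, fields[1], path)


def read_maps(pid="self"):
    """Return regions in file order, or [] where /proc is unavailable."""
    maps_path = f"/proc/{pid}/maps"
    if not os.path.exists(maps_path):
        return []
    with open(maps_path) as f:
        text = f.read()  # read everything first, then parse
    return [parse_maps_line(line) for line in text.splitlines() if line.strip()]
```

Printing `read_maps()` top to bottom gives the "low addresses first" picture: on a typical process the `[heap]` entry appears well above the `[stack]` entry on screen, even though its addresses are lower.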
“Before main()” and freestanding tricks
- Several comments focus on code that runs before `main`:
  - Use of custom `_start` that simply passes `argc`/`argv`/`envp`/`auxv` directly to a “main” without libc initialization.
  - Writing fully freestanding Linux programs with just syscalls and custom utilities (e.g., own `printf`/`malloc`), including an experimental Lisp interpreter and user space built on raw syscalls.
- Note that `_start` is just an arbitrary entry symbol the linker uses; the ELF header’s entry address is what truly matters.
- Others mention language‑runtime behavior and static initializers: lots of code can execute before `main`, which can be exploited or can cause subtle crashes (including the linked SIGFPE‑before‑main example).
- Microcontroller programmers relate analogous work on bare‑metal devices (e.g., PIC16), where you hand‑configure stack pointers, timers, and memory, with no OS or libc at all.
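The remark that the entry address, not the name `_start`, is what matters can be illustrated by reading `e_entry` straight from the ELF header. This is a minimal sketch assuming a little‑endian ELF64 file; `elf_entry` is a name invented here.

```python
import struct


def elf_entry(data):
    """Return e_entry, the address execution starts at, from an ELF64 header."""
    if len(data) < 32 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")
    # e_entry sits at offset 24, after e_ident (16 bytes), e_type (2),
    # e_machine (2) and e_version (4).
    (entry,) = struct.unpack_from("<Q", data, 24)
    return entry
```

No symbol name appears anywhere in that lookup. A linker option such as GNU ld's `-e` can point the entry at a different symbol, and the header simply records the resulting address.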