The Journey Before main()

Symbol tables and binary size

  • Comparison of readelf output shows a statically linked musl “hello world” with thousands of symbols vs ~36 when linked to glibc dynamically.
  • Commenters attribute this to static linking and musl’s design, not just RISC‑V or build flags.

Avoiding the C standard library / direct syscalls

  • Some enjoy writing C programs that bypass libc and call Linux syscalls directly, or use “nolibc” headers.
  • Others argue this is fun but impractical: loses portability, requires re‑implementing basics (string/number conversion, allocators), and ties you to kernel ABIs.
  • Several note that on non‑Linux systems (BSDs, Windows) the raw syscall ABI is not a stable interface, so programs have to go through libc or, on Windows, the system DLLs.
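
A minimal sketch of the trade-off described above, assuming Linux. The helper names raw_write and u64_to_dec are hypothetical, not taken from the thread. It still uses libc's syscall() wrapper for brevity; a truly libc-free build would need architecture-specific inline assembly and -nostdlib.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Write a NUL-terminated string to a file descriptor through the raw
 * write syscall number, bypassing stdio. Returns bytes written or -1. */
static long raw_write(int fd, const char *s)
{
    return syscall(SYS_write, fd, s, strlen(s));
}

/* Hand-rolled unsigned-to-decimal conversion: the kind of basic utility
 * that has to be re-implemented once printf is off the table.
 * buf must hold at least 21 bytes. Returns the string length. */
static size_t u64_to_dec(unsigned long long v, char *buf)
{
    char tmp[20];               /* 2^64 - 1 has 20 decimal digits */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    /* Digits were produced least-significant first; reverse them. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Printing a number then takes two steps: u64_to_dec(42, buf) followed by raw_write(1, buf). That gives a small taste of the "re-implementing basics" cost the commenters mention.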

Windows vs Linux APIs and linking models

  • On Windows, typical apps call Win32 APIs (Kernel32, User32, GDI32, etc.), not raw syscalls; ntdll.dll and win32u.dll wrap the actual syscalls, whose numbers change across Windows versions.
  • Discussion of CRT‑free Win32: you can avoid the C runtime but still rely on system DLLs; some show hacks that call into kernel32 without an import table, but these are unsupported and AV‑unfriendly.
  • Debate over whether “Windows support is a requirement”: some insist serious (“adult”) projects should plan for it; others say many server and embedded systems will never run on Windows, so Linux‑only is fine.
  • Long comparison of linking models:
    • Windows: import libraries, DLLs treated more as black boxes, fewer global shared libs, more stable system DLL ABIs.
    • Linux/GNU: direct linking to .so files, global library paths, versioned glibc symbols; praised for flexibility, criticized as “ABI/dependency hell” that motivated Docker and per‑app bundling.

Glibc, ABI stability, and libc alternatives

  • Complaints that glibc changes can break binaries (example: Steam/games impacted by ELF/exec‑on‑stack changes, later reverted).
  • View that on GNU/Linux, glibc effectively is half the OS: the dynamic loader, NSS, DNS behavior, and many facilities are in glibc, not the kernel.
  • Some want glibc split conceptually into three parts: syscall wrappers, dynamic loader, and higher‑level C library, to isolate ABI.
  • musl + static linking is proposed as a simple, portable option for non‑GUI tools, at the cost of size and some performance.
  • Others argue that replacing glibc amounts to building a different OS stack and incurs substantial effort.

ELF loading, dynamic linkers, and loaders

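  • Clarification that on Linux the kernel only maps the PT_LOAD segments and, when the executable has a PT_INTERP entry, also maps the ELF interpreter it names and jumps to that interpreter's entry point rather than the program's own.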
  • The user‑space dynamic linker (e.g., glibc’s ld.so) performs relocations, loads all shared objects via mmap/mprotect, handles dlopen, auditing, preloads, etc.
  • This is compared to shebang handling: the loader is akin to a binary “interpreter”.
  • People note how complex loaders really are (dependency graph resolution, init/fini, audit features), which explains why there aren’t many alternative loaders and why loaders are tightly coupled to their libc.
  • Discussion that the kernel ignores ELF sections entirely; it only cares about program headers (segments). Embedding extra sections doesn’t make the kernel map them unless PT_LOAD entries are updated.
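
The segment-only view described above can be sketched by reading just the program header table, which is the part of the file the kernel consults. This is an illustrative sketch, assuming a 64-bit ELF in the host's byte order. The names phdr_summary and summarize_phdrs are hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* What the program header table says about loading. */
struct phdr_summary {
    int  load_segments;   /* number of PT_LOAD entries */
    char interp[256];     /* PT_INTERP path, or "" if there is none */
};

/* Read the program headers of the 64-bit ELF file at `path`.
 * Section headers are never touched. Returns 0 on success, -1 on
 * I/O error or if the file is not a 64-bit ELF. */
static int summarize_phdrs(const char *path, struct phdr_summary *out)
{
    Elf64_Ehdr eh;
    int rc = -1;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;
    memset(out, 0, sizeof *out);

    if (fread(&eh, sizeof eh, 1, f) != 1) goto done;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) goto done;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) goto done;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);

        if (fseek(f, off, SEEK_SET) != 0) goto done;
        if (fread(&ph, sizeof ph, 1, f) != 1) goto done;

        if (ph.p_type == PT_LOAD) {
            out->load_segments++;
        } else if (ph.p_type == PT_INTERP) {
            /* Copy at most sizeof(interp) - 1 bytes of the path. */
            size_t n = (size_t)ph.p_filesz < sizeof out->interp
                           ? (size_t)ph.p_filesz
                           : sizeof out->interp - 1;
            if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0) goto done;
            if (fread(out->interp, 1, n, f) != n) goto done;
            out->interp[n] = '\0';
        }
    }
    rc = 0;
done:
    fclose(f);
    return rc;
}
```

Calling summarize_phdrs("/proc/self/exe", &s) on a dynamically linked program should report a non-empty interp, while a fully static binary normally leaves it empty.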

Shebangs, binfmt, and debugging issues

  • A bug story: a Java program got ENOENT while executing an existing script because the script’s shebang interpreter path didn’t exist on the remote host; Java surfaced just “No such file or directory.”
  • Advice: use strace to see the failing execve; note that shebang support itself depends on the CONFIG_BINFMT_SCRIPT kernel option.
  • Mention of binfmt_misc for associating arbitrary magic with interpreters (used for Wine, qemu user‑mode, etc.).
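
The ENOENT story above can be checked ahead of time by inspecting the script's first line. This is a rough sketch, not how the kernel implements it. The helper names shebang_interpreter and shebang_interpreter_missing are hypothetical, and the 256-byte line buffer is an arbitrary choice for the example.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Extract the interpreter path from a script's "#!" line.
 * Returns 1 if found, 0 if there is no usable shebang, -1 on I/O error. */
static int shebang_interpreter(const char *script, char *out, size_t outsz)
{
    char line[256];
    char *got;
    FILE *f = fopen(script, "r");

    if (!f)
        return -1;
    got = fgets(line, sizeof line, f);
    fclose(f);
    if (!got || strncmp(line, "#!", 2) != 0)
        return 0;

    const char *p = line + 2;
    p += strspn(p, " \t");              /* skip blanks after "#!" */
    /* A trailing '\r' is deliberately kept: a CRLF script really does
     * name an interpreter ending in '\r', which usually does not exist. */
    size_t n = strcspn(p, " \t\n");
    if (n == 0 || n >= outsz)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Returns 1 when the script exists but its interpreter is not
 * executable, the situation where execve reports ENOENT for a
 * script that is plainly there. */
static int shebang_interpreter_missing(const char *script)
{
    char interp[256];

    if (shebang_interpreter(script, interp, sizeof interp) != 1)
        return 0;
    return access(interp, X_OK) != 0;
}
```

Running a check like this on the target host, alongside strace, separates "script missing" from "interpreter missing" when the caller only surfaces "No such file or directory".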

Direct syscalls vs system libraries for higher‑level facilities

  • Some argue that directly using kernel syscalls is ideal for minimalism and clarity.
  • Others counter that newer subsystems (ALSA, DRM, GPU drivers) are intentionally fronted by user‑space libraries; this makes interception, portability, 32‑/64‑bit compatibility, and ABI evolution much easier.
  • This is cast as a “Windows‑style” design (rich system libraries over a smaller syscall surface) being preferable for many real programs.

Memory layout and teaching stack/heap

  • A university instructor points out that most textbooks draw virtual memory with “higher addresses at the top”, which conflicts with how editors and /proc/<pid>/maps present things (addresses increase as you go down).
  • They argue that drawing low addresses at the top and high addresses at the bottom matches real tooling and makes the layout easier to see: text and heap sit at lower addresses and the stack sits near the top of the address space; the heap grows toward higher addresses while the stack pointer moves toward lower ones.
  • Others push back, noting long‑standing conventions (“stack grows down”) and other mental models; some prefer horizontal layouts (low on left, high on right).
  • There’s also side discussion about little‑endian representation vs how numeric addresses are written, and how that can confuse beginners.
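
For readers who want to see the layout on their own machine, here is a small probe, assuming a typical Linux process. The names layout_probe, probe_layout, and is_little_endian are hypothetical. Casting a function pointer to uintptr_t is not strictly ISO C, but it works on POSIX systems.

```c
#include <stdint.h>
#include <stdlib.h>

/* One sample address from each region under discussion. */
struct layout_probe {
    uintptr_t text;    /* address of a function */
    uintptr_t heap;    /* address of a small malloc'd block */
    uintptr_t stack;   /* address of a local variable */
};

static struct layout_probe probe_layout(void)
{
    struct layout_probe p;
    int local = 0;
    void *block = malloc(16);

    p.text  = (uintptr_t)&probe_layout;
    p.heap  = (uintptr_t)block;
    p.stack = (uintptr_t)&local;
    free(block);
    return p;
}

/* Returns 1 if the least significant byte of a multi-byte integer is
 * stored at the lowest address (little-endian), 0 otherwise. */
static int is_little_endian(void)
{
    uint32_t v = 1;
    return *(unsigned char *)&v == 1;
}
```

On common Linux configurations the stack address comes out numerically larger than both the heap and text addresses. That matches the order of lines in /proc/<pid>/maps, where the stack appears near the bottom of the listing.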

“Before main()” and freestanding tricks

  • Several comments focus on code that runs before main:
    • Use of custom _start that simply passes argc/argv/envp/auxv directly to a “main” without libc initialization.
    • Writing fully freestanding Linux programs with just syscalls and custom utilities (e.g., own printf/malloc), including an experimental Lisp interpreter and user space built on raw syscalls.
  • Note that _start is only the linker's default entry symbol and can be renamed; what truly matters at run time is the entry address recorded in the ELF header (e_entry).
  • Others mention language‑runtime behavior and static initializers: lots of code can execute before main, which can be exploited or can cause subtle crashes (including the linked SIGFPE‑before‑main example).
  • Microcontroller programmers relate analogous work on bare‑metal devices (e.g., PIC16), where you hand‑configure stack pointers, timers, and memory, with no OS or libc at all.
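
Two of the points above can be observed from ordinary C, sketched here under the assumption of GCC or Clang on Linux. The constructor attribute is a compiler extension, and the names before_main, ctor_ran, and program_entry are hypothetical. getauxval(AT_ENTRY) reports the program's entry address as the kernel passed it in the auxiliary vector.

```c
#include <elf.h>
#include <sys/auxv.h>

/* Set by the constructor below, before main() is entered. */
static int ctor_ran = 0;

/* Functions marked with the constructor attribute are run by the
 * startup code (or the dynamic loader) before main. */
__attribute__((constructor))
static void before_main(void)
{
    ctor_ran = 1;
}

/* The executable's entry point as recorded in the auxiliary vector;
 * with default linker settings this is the address of _start. */
static unsigned long program_entry(void)
{
    return getauxval(AT_ENTRY);
}
```

By the time main begins, ctor_ran is already 1. Any crash inside such a constructor therefore shows up "before main", which is the class of surprise the thread describes.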