Huge Binaries
Binary sizes and where the bloat comes from
- 25 GiB+ binaries are described, with commenters noting that most of that size can be debug info rather than executable code.
- C++ debug symbols are highlighted as a huge contributor: templates, type info, local variable locations, line mappings, and multiple specializations generate massive DWARF sections.
- Some extreme cases: LLVM-dependent builds >30 GB, 25 GB stripped binaries, and games or applications embedding large assets or model weights inside the executable.
Static vs dynamic linking at large scale
- Large shops favor static (or mostly static) binaries for:
  - Startup speed and reduced dynamic-loader overhead (PLT/GOT indirection, symbol interposition).
  - Easier profiling, crash-dump analysis, and fleet-wide tooling that assumes a single monolithic binary.
  - Binary provenance and security guarantees: “what’s running is exactly what we built”.
- Reasons given for avoiding dynamic libraries:
  - ABI instability and header-only templates make reusable .so’s hard in big C++ monorepos.
  - Different builds use different library versions, defeating sharing.
  - Historical ld.so performance problems when loading many shared objects.
  - Operational weirdness at scale (e.g., a bit flip or on-disk corruption making one shared library “poisonous” for every process on a node).
- Skeptics point out that huge cloud providers successfully use dynamic linking and managed runtimes, questioning whether static linking is truly required for scale.
Debug info handling and tooling
- Detached debug files, split DWARF (`-gsplit-dwarf`), and compressed debug sections are widely known and used, but the tooling is seen as clumsy.
- Several note that debug info sections don’t affect relocation distances or runtime memory, since they are non-allocated ELF sections.
- Operational practice: ship stripped binaries, keep symbol files in a “symbol DB” for post-mortem debugging.
Code size, dead code, and optimizations
- Many argue that hitting the 2 GiB `.text` limit signals missing dead-code elimination: use LTO, `-ffunction-sections` + `--gc-sections`, identical code folding, tree-shaking, or better partitioning.
- Others counter that even with these, large monolithic C++ services can genuinely approach 2 GiB of code.
Code models, thunks, and relocation limits
- Discussion dives into x86-64 code models and the 2 GiB relative jump/call limit.
- Medium/large code models, thunks/trampolines, and post-link optimizers like BOLT are discussed as strategies, each with performance tradeoffs.
- It’s noted that a proper range-extension thunk ABI for x86-64 would be preferable to pessimistically upgrading everything to the large code model.