I hate compilers

Reproducible builds & compiler determinism

  • Strong disagreement over how important reproducible builds are.
    • Pro side: needed to verify distro binaries (the xz backdoor is cited as an example), check GPL compliance, support military or high‑assurance users who recompile and compare hashes, and give auditors proof that a shipped binary corresponds to the reviewed code.
    • Skeptical side: sees these as mostly theoretical; prefers signatures and CI traceability over bit‑identical outputs; argues users who care should build from source themselves.
  • Concrete issues:
    • Non‑deterministic compiler output caused by data structures whose iteration order depends on pointer values or is otherwise unspecified (e.g., LLVM’s DenseMap), with ASLR changing the internal memory layout from run to run; suggested fixes are deterministic containers (e.g., MapVector) and treating any such non‑determinism as a compiler bug.
    • Timestamps and build‑time macros like __DATE__/__TIME__ are cited as “easy” ways to break reproducibility. Commenters disagree on whether embedding them is an obvious, intentional tradeoff or an easy accidental footgun; some argue such macros don’t belong in compilers at all.
  • Nix/Guix:
    • Praised for hermetic build environments (e.g., pinning timestamps via SOURCE_DATE_EPOCH and hashing build inputs).
    • Others note they don’t solve compiler bugs or concurrency‑dependent non‑determinism.
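
The two concrete issues above (unordered iteration and embedded timestamps) can be illustrated with a small sketch. It is not taken from the discussion: the function names and the "symbol table" output format are hypothetical, and sorting by name is used here as a stand-in for a deterministic container such as MapVector (which instead preserves insertion order). SOURCE_DATE_EPOCH is the real environment variable, holding seconds since the Unix epoch.

```python
import hashlib
import os
import time


def build_timestamp() -> int:
    """Return the timestamp to embed in the build output.

    Uses SOURCE_DATE_EPOCH when it is set, so rebuilding the same source
    yields the same value; falls back to the wall clock otherwise.
    """
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    return int(epoch) if epoch is not None else int(time.time())


def emit_symbols(symbols: dict[str, int]) -> bytes:
    """Serialize a (hypothetical) symbol table in a fixed order.

    Sorting by name makes the bytes independent of the order in which
    the symbols happened to be inserted or iterated.
    """
    lines = [f"{name}={addr:#x}" for name, addr in sorted(symbols.items())]
    return "\n".join(lines).encode()


def artifact_digest(symbols: dict[str, int]) -> str:
    """Hash the full artifact: a timestamp header plus the symbol table."""
    header = f"built={build_timestamp()}\n".encode()
    return hashlib.sha256(header + emit_symbols(symbols)).hexdigest()
```

With SOURCE_DATE_EPOCH set, two builds that insert the same symbols in different orders produce the same digest; with it unset, the digest changes whenever the clock does, which is the failure mode attributed to __DATE__/__TIME__.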

LLMs, binaries, and compilers

  • One line of discussion suggests training LLMs directly on binaries to generate machine code.
  • Most responses are skeptical:
    • Small binary perturbations break executables; there is no “plausible” binary.
    • LLMs struggle with counting and fine‑grained precision (e.g., exact jump offsets), even if they do well at high‑level or vague tasks.
    • Any serious harness would end up reintroducing compiler/assembler‑like structure.
  • Side debate about “hard” vs “easy” tasks for humans vs LLMs and the relevance of Moravec’s paradox.
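
To make the "exact jump offsets" point concrete, here is a minimal sketch of how an assembler encodes an x86 short jump (opcode 0xEB followed by an 8‑bit signed displacement measured from the end of the jump instruction). The helper name encode_jump is hypothetical and the example is illustrative only; it is not drawn from the discussion.

```python
def encode_jump(jump_addr: int, target_addr: int) -> bytes:
    """Encode an x86 short jump located at jump_addr that lands on target_addr.

    The displacement is relative to the address of the *next* instruction,
    which is jump_addr + 2 because the short jump itself is two bytes long.
    """
    rel = target_addr - (jump_addr + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB]) + rel.to_bytes(1, "little", signed=True)


# Jump at 0x10 to a target at 0x20: displacement is 0x0E.
forward = encode_jump(0x10, 0x20)

# Insert a single byte between the jump and its target and the
# displacement must change to 0x0F, or the jump lands mid-instruction.
shifted = encode_jump(0x10, 0x21)
```

Every byte added or removed between a jump and its target changes the encoded displacement, which is the kind of exact bookkeeping an assembler does mechanically and that the skeptics expect a model emitting raw bytes to get wrong.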

Low-level vs high-level environments

  • One view: low‑level work is highly environment‑specific (hardware, CPU generations, vendor protocols), full of hidden assumptions; LLMs therefore struggle more there than at high‑level “wrapping.”
  • Counterview: simple, dependency‑light procedural code is relatively stable compared to fast‑churning web frameworks; discussion over what counts as “low‑level” (C + OS libs vs GUI/web frameworks).

Anubis proof‑of‑work protection

  • Critics liken client‑side PoW challenges to malware or crypto miners: wasted energy, accessibility concerns, and limited deterrence against large AI companies with ample compute.
  • Supporters frame it as a necessary “nuclear option” to throttle abusive scrapers and keep hosting costs manageable, with challenge difficulty adapted to client signals (residential IP, behavior).
  • Some propose making PoW do “useful work” (crypto, protein folding), but others worry this would be seen as botnet‑like; technical hurdles (data size, coordination) are noted.
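
For readers unfamiliar with client‑side proof of work, the following is a generic hashcash‑style sketch, not Anubis’s actual implementation. The challenge format, the function names, and the use of leading zero bits of a SHA‑256 digest as the difficulty measure are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until one meets the difficulty.

    Expected work is about 2**difficulty hashes, so each extra bit of
    difficulty doubles the client's cost while verification stays cheap.
    """
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

The asymmetry is the whole design: the server spends one hash per visitor while the client spends many, and raising the difficulty for suspicious clients scales that cost. It also shows why critics call the work wasted, since the hashes are discarded once the check passes.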