Moving beyond fork() + exec()
Cost and behavior of fork()+exec()
- Several comments criticize the article’s phrasing that fork “copies the entire memory”; they stress copy‑on‑write, but note the real cost is copying page tables and VMAs and setting up COW, which is O(process size).
- For very large processes (tens of GBs), fork pauses can reach seconds; examples include Redis and large JVMs. Zygote helpers are often used to avoid forking from huge processes.
- vfork() reduces some costs by avoiding COW, but has strict usage constraints and interacts badly with threads on some systems.
- Overcommit and COW are seen as almost required to make fork fast, but that in turn constrains OS design.
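
The claim that fork cost grows with the parent's size can be checked with a rough measurement. The sketch below is an illustration only: the helper name time_fork is hypothetical, it assumes a Unix system where os.fork() is available, and the numbers it prints depend heavily on the kernel, huge-page settings, and the interpreter's own baseline footprint.

```python
import os
import time

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def time_fork(resident_mib):
    """Return seconds spent in os.fork() with about resident_mib MiB touched."""
    buf = bytearray(resident_mib * 1024 * 1024)
    for offset in range(0, len(buf), PAGE_SIZE):
        buf[offset] = 1  # write to each page so it is actually resident
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits at once without touching any memory
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return elapsed


if __name__ == "__main__":
    for mib in (1, 32, 128):
        print(f"{mib:4d} MiB resident: fork took {time_fork(mib) * 1e3:.2f} ms")
```

The child never writes to memory, so no pages are actually copied; whatever growth shows up in the timings comes from duplicating the parent's mappings and page tables, which is the cost the comments point to.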
Process creation models and alternatives
- Many advocate spawn-style primitives: posix_spawn (native on some OSes), CreateProcess on Windows, NtCreateProcess, XNU’s non‑fork posix_spawn, Mach’s “task/address space/threads” model.
- Ideas proposed: a “blank” process created by a syscall, then configured via other syscalls (possibly taking a pidfd); or a tiny loader process that runs config then exec.
- Some want existing APIs (setuid, fd operations, etc.) to accept target process handles so configuration can be done externally rather than in a forked child.
- Others suggest using ptrace-like mechanisms or eBPF hooks to drive configuration without new per‑feature spawn flags.
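
To give a feel for the spawn-style approach, here is a minimal sketch using Python's os.posix_spawn wrapper. The helper name spawn_and_capture is hypothetical. The child's configuration is expressed as a list of file actions passed to the spawn call instead of code that runs in a forked copy of the parent.

```python
import os
import sys


def spawn_and_capture(argv):
    """Run argv via posix_spawn; return (exit_code, captured_stdout_bytes)."""
    read_fd, write_fd = os.pipe()  # both ends are close-on-exec by default
    file_actions = [
        # Declarative setup: make the pipe's write end the child's stdout.
        (os.POSIX_SPAWN_DUP2, write_fd, 1),
    ]
    pid = os.posix_spawn(argv[0], argv, dict(os.environ),
                         file_actions=file_actions)
    os.close(write_fd)  # parent keeps only the read end
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()  # reads until the child closes its stdout
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -os.WTERMSIG(status)
    return code, output


if __name__ == "__main__":
    code, out = spawn_and_capture([sys.executable, "-c", "print('spawned')"])
    print(code, out)
```

Everything the child needs is stated up front, which is also why critics find this style less extensible: any setup step without a matching file action or attribute cannot be expressed.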
State inheritance and configuration
- A repeated complaint: most real use cases want “launch a new program with a small, explicit set of inherited things,” but fork’s default is “inherit everything then painstakingly exclude.”
- File descriptors are the biggest pain point; O_CLOEXEC and fd iteration are seen as error‑prone.
- A truly “share nothing” spawn is hard to define: containers, cgroups, namespaces, UID/gid, signals, environment, and more all interact in non‑obvious ways.
- Multi‑threaded parents make fork fragile: only one thread survives in the child, so in‑memory locks and malloc state can be left inconsistent; hence the POSIX restriction to async‑signal‑safe functions between fork and exec.
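
The “small, explicit set of inherited things” model can be approximated today for file descriptors. The sketch below uses Python's subprocess module, whose close_fds default and pass_fds parameter implement an allow-list; the helper name run_child_with_pipe is hypothetical. It is an illustration of the idea, not a fix for the underlying fork semantics.

```python
import os
import subprocess
import sys

# The child tries to write to the fd number it is given on the command line.
CHILD_CODE = "import os, sys; os.write(int(sys.argv[1]), b'hello')"


def run_child_with_pipe(share):
    """Spawn a child; return what it managed to write into our pipe."""
    read_fd, write_fd = os.pipe()  # non-inheritable by default (PEP 446)
    subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(write_fd)],
        pass_fds=(write_fd,) if share else (),  # explicit allow-list
        stderr=subprocess.DEVNULL,  # hide the child's traceback when fd is absent
    )
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        return pipe.read()


if __name__ == "__main__":
    print("listed in pass_fds:", run_child_with_pipe(share=True))
    print("not listed:        ", run_child_with_pipe(share=False))
```

The descriptor reaches the child only when it is named in pass_fds; everything else above stderr is closed before the new program runs.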
Use cases and performance sensitivity
- Some argue “spawning shouldn’t be on the hot path”; others note that build systems, fuzzers, heavy process isolation, browsers, and mobile platforms contradict this in practice.
- Zygote patterns (pre‑forked helpers) are widely used but considered hard to retrofit and brittle.
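
A zygote can be sketched in a few lines: fork a helper while the parent is still small, then ask it over a pipe to launch programs later. The names start_zygote and spawn_via_zygote and the one-JSON-list-per-line protocol are assumptions made for this illustration. Real zygotes (for example in browsers or on mobile platforms) are far more involved, and this sketch has no error handling for failed launches.

```python
import json
import os
import sys


def start_zygote():
    """Fork a helper early; return (helper_pid, requests, replies) streams."""
    req_r, req_w = os.pipe()
    rep_r, rep_w = os.pipe()
    pid = os.fork()
    if pid == 0:  # helper process: must never return into the caller's code
        try:
            os.close(req_w)
            os.close(rep_r)
            with os.fdopen(req_r) as requests, os.fdopen(rep_w, "w") as replies:
                for line in requests:  # one JSON-encoded argv list per line
                    argv = json.loads(line)
                    child = os.posix_spawn(argv[0], argv, dict(os.environ))
                    _, status = os.waitpid(child, 0)
                    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
                    replies.write(f"{code}\n")
                    replies.flush()
        finally:
            os._exit(0)  # skip the parent's cleanup handlers and buffers
    os.close(req_r)
    os.close(rep_w)
    return pid, os.fdopen(req_w, "w"), os.fdopen(rep_r)


def spawn_via_zygote(requests, replies, argv):
    """Ask the helper to run argv to completion; return its exit code."""
    requests.write(json.dumps(argv) + "\n")
    requests.flush()
    return int(replies.readline())


if __name__ == "__main__":
    helper_pid, requests, replies = start_zygote()
    print(spawn_via_zygote(requests, replies, [sys.executable, "-c", "pass"]))
    requests.close()  # EOF on the request pipe tells the helper to exit
    replies.close()
    os.waitpid(helper_pid, 0)
```

The brittleness mentioned above shows up even here: the helper has to be started before the parent grows or starts threads, and every new kind of request needs its own protocol support.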
Elegance vs “historical hack”
- One camp praises fork+exec as simple, composable, and key to Unix shell power; alternative spawn APIs with huge parameter structs are seen as ugly and less extensible.
- The opposing camp calls fork an outdated 1970s hack that forces awkward abstractions (e.g., encoding APIs via numbered fds), complicates threading, and ossifies kernel internals.
- Several note that if Unix had started with spawn+configure, many syscalls would have been designed with explicit process arguments, simplifying debugging and supervision.
Shared libraries and memory
- Misconception corrected: shared libraries like libgcc are mapped once and shared across processes via mmap; only relocation/GOT data is per‑process.
- This sharing is cited as a major upside of dynamic linking.