Baffled by generational garbage collection – wingolog

When generational GC helps (and when it doesn’t)

  • Several comments stress that generational GC only shines when workloads match the “most objects die young, survivors live long” pattern and there is a large mature heap.
  • Benchmarks like splay may churn the entire heap, so minor collections give little benefit and can even add extra copying cost.
  • Big real-world systems (servers, GUIs, dynamic languages) often do fit the generational profile: many short-lived temporaries per request/event, plus a large, mostly-stable object graph.

Java and JVM techniques

  • The JVM is said to handle short-lived objects very well via Thread-Local Allocation Buffers (TLABs, bump-pointer allocation) and escape analysis, sometimes effectively turning heap allocations into stack allocations.
  • Modern low-pause collectors (G1, Shenandoah, ZGC) are described as state-of-the-art; ZGC in particular is cited for sub-millisecond pauses.
  • Object pools in Java were historically used to reduce GC pressure but are criticized:
    • Can harm generational GC: pooled objects get promoted and accumulate old-to-young references, adding write-barrier work and forcing more frequent, more expensive full GCs.
    • Are error-prone (double-returns, use-after-free of pooled objects).
  • Binary-trees benchmark is used as an example where Java’s generational + TLAB approach competes with or beats arena-based native code.

Go’s non-generational, concurrent GC

  • Go’s GC is concurrent, non-generational, and explicitly optimized for latency, with “GC assist” that throttles heavy allocators.
  • Supporters like that it rarely breaks SLOs and usually “just works” without tuning.
  • Critics say throughput is worse than modern JVM/.NET GCs and there are scaling issues at large heaps or high core counts, with few tuning “escape hatches.”
  • Go relies heavily on stack allocation via escape analysis, so it simply creates less garbage.

.NET and C# perspectives

  • .NET has long used generational GC with server/workstation modes and newer options to behave more or less “cooperatively” with other processes.
  • It’s portrayed as close to Java in GC sophistication, often more memory‑efficient, and helped by value types and lower allocation rates.
  • Discussion diverges into whether highly optimized C# can rival C/Rust performance; opinions differ, but many agree C# has gained powerful low-level features (spans, ref structs, custom allocators).

Object pools, arenas, and the memory-management continuum

  • Commenters emphasize that memory management is a continuum: GC, malloc variants, arenas, static layouts, and custom allocators can all be mixed in one program.
  • Java now has arena allocators; performance‑sensitive code in multiple ecosystems sometimes bypasses GC-managed heaps entirely.

Benchmarks, bias, and the generational hypothesis

  • Several note a methodological risk: if you design benchmarks and software assuming cheap young-object allocation, you’ll naturally validate the generational hypothesis.
  • Some suggest richer instrumentation (e.g., per-callsite lifetime profiling, pretenuring, multiple generations sized to cache/request lifetimes) to better test and tune generational collectors.

Debate over Whippet and Immix

  • One thread criticizes the blog author’s GC choices (mark-sweep, Immix) as “not proper” compared to classic semispace+generational designs. Others push back, calling Immix state-of-the-art and noting that the author has written extensively about advanced GC techniques.
  • There’s some confusion about what was said in the talk versus what exists in the code and prior posts, and no clear consensus emerges on the quality of the author’s GC implementations.