1B nested loop iterations
Benchmark design and realism
- Many commenters view “1B nested loop iterations” as a highly contrived microbenchmark.
- It mainly measures how compilers optimize tight integer loops with modulo, not real-world workloads with allocations, branches, indirections, or objects.
- Some argue this is still representative of “average bad code” heavy on loops and arithmetic; others say such hot loops are a tiny fraction of real execution.
- Several commenters stress that div/mod is unusually slow, so the benchmark understates what C and Rust can do on more typical arithmetic.
- Some raise concerns that the benchmark encourages misleading language comparisons because it lacks a clear methodology and caveats.
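For context, the kind of loop under discussion can be sketched as below. The shape (an outer loop over array slots, an inner loop accumulating `j % u`) and the names `nested_loop_sum`, `outer`, `inner`, and `u` are assumptions for illustration, not the benchmark's actual source. The sizes in the usage note are scaled far below one billion iterations so the sketch finishes quickly.

```python
def nested_loop_sum(outer: int, inner: int, u: int) -> list:
    """Accumulate j % u into each of `outer` slots (hypothetical benchmark shape)."""
    a = [0] * outer
    for i in range(outer):
        for j in range(inner):
            # Commenters single out this modulo as the expensive operation.
            a[i] += j % u
    return a
```

Calling `nested_loop_sum(100, 1_000, 7)` runs 100,000 inner iterations. Nothing in the loop allocates, branches unpredictably, or chases pointers, which is the basis of the "contrived" criticism above.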
Garbage collection and performance consistency
- Discussion emphasizes that beyond raw speed, GC’d languages face issues with startup time and pause consistency.
- Modern JVM collectors (e.g., Shenandoah, G1) reportedly achieve sub-millisecond pauses, but GC remains a concern for latency-sensitive domains (games, VR).
- Game dev anecdotes: hitches often stem from excessive short-lived allocations in render loops; object pools and careful allocation patterns help.
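The object-pool idea from the game dev anecdotes can be sketched as follows. `Particle` and `ParticlePool` are hypothetical names invented for this illustration. CPython relies mainly on reference counting rather than a pausing tracing collector, so this shows only the allocation pattern, not the pause behavior of a JVM or game engine.

```python
class Particle:
    """Hypothetical short-lived object a render loop might create every frame."""
    __slots__ = ("x", "y", "alive")

    def __init__(self) -> None:
        self.x = 0.0
        self.y = 0.0
        self.alive = False


class ParticlePool:
    """Preallocates particles and recycles them instead of allocating per frame."""

    def __init__(self, capacity: int) -> None:
        self._free = [Particle() for _ in range(capacity)]

    def acquire(self, x: float, y: float) -> Particle:
        # Reuse a pooled object when one is free; allocate only if the pool is empty.
        p = self._free.pop() if self._free else Particle()
        p.x, p.y, p.alive = x, y, True
        return p

    def release(self, p: Particle) -> None:
        # Return the object to the pool so the next acquire() can reuse it.
        p.alive = False
        self._free.append(p)

    def free_count(self) -> int:
        return len(self._free)
```

A render loop would call `acquire` when it spawns an effect and `release` when the effect ends, so steady-state frames create no new objects for the collector to track.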
Language-specific observations
- Go appears slower than C/C++/Rust largely because the Go version uses 64-bit ints where the others use 32-bit; 64-bit modulo is significantly slower, and the wider integers make less efficient use of cache. With int32 and GC tweaks, Go gets closer to Java/C++.
- PyPy is vastly faster than CPython on this benchmark thanks to its JIT and the arithmetic-heavy workload, though commenters note that this exaggerates typical speedups.
- R and Python are said to benefit enormously from vectorized operations (e.g., using seq_len/sum or NumPy) rather than explicit loops.
- JavaScript’s performance in both Deno and browsers surprises some, though differences between engines (Chrome vs. Firefox) are noted.
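The vectorization point for Python can be sketched as follows, assuming NumPy is installed. The function names `loop_mod_sum` and `vectorized_mod_sum` are made up for this example, and the modulo-sum workload is an assumed stand-in for the benchmark's inner loop.

```python
import numpy as np


def loop_mod_sum(n: int, u: int) -> int:
    """Explicit interpreter-level loop: one bytecode dispatch per iteration."""
    total = 0
    for j in range(n):
        total += j % u
    return total


def vectorized_mod_sum(n: int, u: int) -> int:
    """Same result, but the modulo and the sum run inside NumPy's compiled code."""
    values = np.arange(n, dtype=np.int64)  # materializes 0..n-1 as an array
    return int((values % u).sum())
```

Both functions return the same value for the same non-negative `n` and positive `u`. The vectorized form trades memory for speed, since it builds an `n`-element array before reducing it.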
Visualization and communication
- The moving-circle visualization is praised as intuitive by some and criticized as confusing or no better than a bar chart by others.
- Several stress that microbenchmarks must be interpreted cautiously and ideally complemented with broader, more realistic benchmark suites.