Data Processing Benchmark Featuring Rust, Go, Swift, Zig, Julia etc.

Java, JIT, and C++ Performance Debate

  • Several commenters argue the Java sample is misconfigured (SerialGC, no heap tuning, explicit System.gc()), so its poor showing vs C++ is not meaningful.
  • Others claim Java’s “abstraction penalty” should always leave it slower than C++; multiple replies counter that modern JVM JITs can match or beat C++ on many workloads once warmup and heap sizing are handled correctly (see the sketch after this list).
  • Deep dives into Java internals cite escape analysis, object flattening (Valhalla), speculation + deoptimization, and inlining of virtual (vtable) calls as reasons the JIT can eliminate many of these overheads, though cache-unfriendly object layouts remain a real cost until inline/value types land.
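
A minimal sketch of what the “configured correctly” side is describing, with illustrative flag choices and a hypothetical Bench/process() harness rather than the benchmark’s actual code: swap SerialGC for a throughput collector, pin the heap so it never resizes, neutralize the explicit System.gc() call, and warm the JIT before timing.

    // Illustrative launch flags (not the benchmark's actual ones):
    //   java -Xms4g -Xmx4g -XX:+UseParallelGC -XX:+DisableExplicitGC Bench
    // -Xms == -Xmx avoids heap resizing during the run;
    // -XX:+DisableExplicitGC turns System.gc() calls into no-ops.
    public final class Bench {
        public static void main(String[] args) {
            byte[] input = loadInput();           // hypothetical input loader
            for (int i = 0; i < 5; i++) {         // warmup: let the JIT compile the hot paths
                process(input);
            }
            long best = Long.MAX_VALUE;
            for (int i = 0; i < 10; i++) {        // timed runs after warmup
                long t0 = System.nanoTime();
                process(input);
                best = Math.min(best, System.nanoTime() - t0);
            }
            System.out.printf("best run: %.1f ms%n", best / 1e6);
        }

        static byte[] loadInput() { return new byte[0]; }       // placeholder
        static void process(byte[] input) { /* workload under test */ }
    }

Reporting the minimum (or a percentile) of several timed runs also reduces sensitivity to the wall-clock noise criticized in the next section.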

Benchmark Methodology Criticisms

  • Many see the benchmark as “sloppy”: odd compiler/VM flags, minimal or inconsistent warmup, use of Stopwatch/wall time, GitHub Actions as a noisy environment, unclear IO/disk/cache conditions.
  • Code quality varies widely between languages; some implementations are obviously unoptimized or written by non-experts, undermining cross-language comparisons.
  • The multicore results (e.g., C# beating Go, Zig “concurrent” being slower) are widely suspected to reflect implementation details (channels, contention, SIMD usage) rather than language fundamentals.
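
To make the “contention, not the language” point concrete, here is a hypothetical Java sketch (not code from the benchmark) contrasting a parallel sum that funnels every update through one shared atomic counter with one that reduces per-worker partials and merges them once:

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.stream.IntStream;

    public final class ParallelStyles {
        // Contended: every element bumps one shared counter; cache-line
        // ping-pong between cores can dominate the runtime.
        static long contended(int[] data) {
            AtomicLong total = new AtomicLong();
            IntStream.range(0, data.length).parallel()
                     .forEach(i -> total.addAndGet(data[i]));
            return total.get();
        }

        // Contention-free: each worker reduces its own chunk of the stream;
        // the partial sums are combined once at the end.
        static long partials(int[] data) {
            return IntStream.range(0, data.length).parallel()
                            .mapToLong(i -> data[i])
                            .sum();
        }

        public static void main(String[] args) {
            int[] data = new int[1_000_000];
            java.util.Arrays.fill(data, 1);
            System.out.println(contended(data) + " " + partials(data));
        }
    }

Both return the same result, but the shared-counter version serializes on a single cache line, so its scaling behaviour says more about the implementation than about the language or runtime.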

Language-Specific Notes (Julia, Python, R, Lisp, Ruby)

  • Julia impresses compared to plain Python; users report 10–100x speedups when porting NumPy-heavy pipelines.
  • Only one Python variant is charted even though plain, NumPy, and Numba versions exist.
  • R is missing from the charts; some argue it would be very slow, while others say the included R code is old and uses a notoriously slow JSON package, so it isn’t representative.
  • Common Lisp appears surprisingly slow; light tuning (type declarations, better data structures, fewer allocations) can easily make it 2× faster, suggesting similar easy gains likely exist in other languages.
  • Ruby’s multi-minute times, against sub-second results elsewhere, prompt questions about how representative the Ruby implementation is.

Systems, GC, and “Ignored” Languages (D, Zig, Nim, C#, Go)

  • D’s strong performance sparks “D gets no respect” comments; others point to ecosystem weakness and GC reliance, arguing Rust/Go/Java/C# are more compelling choices.
  • The weak Zig and Odin results are blamed on poor implementations; some commenters suspect LLM-generated code.
  • C# is praised for modern low-level features (SIMD, spans, stackalloc, source generators) and a strong ecosystem; its good multicore showing is attributed to explicit SIMD and contention-free parallelism (see the sketch after this list).
  • Nim is cited as “Python-like but fast,” with LLMs making library development easier, though others are skeptical that LLMs truly lower the expertise bar.
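
For readers unfamiliar with what “explicit SIMD” buys: C# exposes it through System.Numerics.Vector<T> and hardware intrinsics. Keeping with Java for the examples in this summary, the sketch below uses the incubating jdk.incubator.vector API (JDK 16+, run with --add-modules jdk.incubator.vector) as a rough analogue; it is an illustration, not code from any benchmark entry.

    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorOperators;
    import jdk.incubator.vector.VectorSpecies;

    public final class SimdSum {
        private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

        // Sums the array several lanes at a time instead of one element per iteration.
        static float sum(float[] a) {
            FloatVector acc = FloatVector.zero(SPECIES);
            int i = 0;
            int upper = SPECIES.loopBound(a.length);
            for (; i < upper; i += SPECIES.length()) {
                acc = acc.add(FloatVector.fromArray(SPECIES, a, i));
            }
            float total = acc.reduceLanes(VectorOperators.ADD);
            for (; i < a.length; i++) {   // scalar tail for the leftover elements
                total += a[i];
            }
            return total;
        }

        public static void main(String[] args) {
            float[] data = new float[1003];
            java.util.Arrays.fill(data, 1.0f);
            System.out.println(sum(data));   // expects 1003.0
        }
    }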

Rules, “HO” Variants, and Broader Takeaways

  • Rules like “no SIMD” but “production-ready” and “must represent tags as strings” are called arbitrary and even exploitable (e.g., degenerate string encodings, interning; see the sketch after this list).
  • Highly optimized (“HO”) versions using better data structures/algorithms can be 10–100× faster, underscoring that algorithm and design dominate language choice.
  • Many conclude this benchmark is fun but not authoritative; for real decisions, one should build problem-specific benchmarks or consult more rigorous suites (the Benchmarks Game, TechEmpower, etc.).
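
As an illustration of the “interning” loophole mentioned above (the benchmark’s exact rule wording is not quoted here, so treat this as a hypothetical): if every tag is canonicalized to one shared String instance, the data still “is” a string, yet later comparisons and grouping become reference-cheap.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public final class TagPool {
        private static final Map<String, String> POOL = new ConcurrentHashMap<>();

        // Returns a canonical instance for the tag, so equal tags share one object.
        static String canon(String tag) {
            return POOL.computeIfAbsent(tag, t -> t);
        }

        public static void main(String[] args) {
            String a = canon(new String("user"));
            String b = canon(new String("user"));
            // Reference equality now suffices for canonicalized tags
            // (String.equals also short-circuits on ==).
            System.out.println(a == b);           // true
            System.out.println(a.equals("user")); // still an ordinary String
        }
    }

A variant could go further and count tags in a java.util.IdentityHashMap, staying within the letter of a “tags as strings” rule while dodging most of its intended cost.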