Speeding up Ruby by rewriting C in Ruby
Ruby, YJIT, and alternative implementations
- Discussion notes that the idea of a “Ruby stdlib in Ruby” predates YJIT (e.g., Rubinius, TruffleRuby), with mixed past results (Rubinius ended up slower than MRI).
- TruffleRuby is highlighted as extremely fast and capable of treating C extensions like Ruby code, allowing JIT optimization of C paths.
- YJIT’s implementation history (from C to Rust) is mentioned; Rust is seen as a good trade-off despite build-toolchain friction.
- Some report mixed real-world speedups from TruffleRuby vs MRI and stress careful benchmarking due to startup and warmup behavior.
- TruffleRuby is open source and based on Graal; seen as “forkable” if Oracle ever changes direction.
Rails on TruffleRuby
- One view: Rails “doesn’t work” on TruffleRuby and won’t soon, especially with Rails 8 requiring Ruby 3.2.
- Counterpoint: TruffleRuby claims to run Rails and many gems; not being “100% MRI 3.2 compatible” doesn’t necessarily mean Rails is broken.
- Overall status of full Rails compatibility is unclear from the thread.
Benchmarks, microbenchmarks, and interpretation
- Some argue microbenchmarks are often dismissed too quickly: they do expose real issues (e.g., high function-call overhead in dynamic languages).
- Others stress they are narrow: you can’t responsibly claim “X is N× slower than Y in general” from a tiny benchmark.
- Links to larger benchmark suites (e.g., Benchmarks Game, other repos) are cited to show wide variance across implementations and tasks.
- Methodological criticisms appear: too few runs, reliance on wall-clock time, lack of JMH for JVM tests, and ignoring startup costs.
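The methodology points can be illustrated with a minimal sketch using CRuby’s stdlib Benchmark module and a hypothetical workload (the method name `work` and the run count are illustrative, not from the thread): warm up first, take several timed runs, and report more than a single wall-clock number.

```ruby
require "benchmark"

# Hypothetical workload: a tight call-heavy loop, the kind of thing
# microbenchmarks use to expose call overhead in dynamic languages.
def work
  total = 0
  100_000.times { |i| total += i }
  total
end

work # warmup run, so caches (and the JIT, if enabled) settle before timing

# Several timed runs instead of one; report min and median rather than
# a single wall-clock figure that one noisy run could skew.
runs = Array.new(5) { Benchmark.realtime { work } }
puts "min:    #{runs.min.round(6)}s"
puts "median: #{runs.sort[runs.size / 2].round(6)}s"
```

Tools like JMH automate exactly this discipline (warmup iterations, many forks, statistical summaries) for JVM code; the sketch above is the hand-rolled equivalent.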
Python performance, C libraries, and mission-critical use
- Several comments note that many Python workloads push heavy computation into C/Fortran libraries; Python acts as glue.
- Others respond that any language with FFI can do this; the baseline slowness of pure Python still matters.
- Debate over acceptability in constrained or mission-critical systems:
  - Some describe successful use of Python even on a satellite, where extra milliseconds and milliwatts are acceptable.
  - Others argue that for highly power- or latency-sensitive systems (e.g., long-endurance drones), interpreter overhead and GC are prohibitive.
- Concerns raised about dynamic languages for mission-critical software, even with optional static typing tools.
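The “any language with FFI can do this” point can be sketched with Ruby’s stdlib Fiddle, calling into C code already loaded in the process (libc’s `strlen`, chosen only because it is present in virtually every process image):

```ruby
require "fiddle"

# Resolve libc's strlen among the symbols already loaded into this
# process; no gem or separate compilation step is needed.
strlen = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["strlen"],
  [Fiddle::TYPE_VOIDP], # const char *
  Fiddle::TYPE_SIZE_T
)

# Ruby Strings are passed to C as NUL-terminated buffers.
puts strlen.call("glue code") # => 9
```

This is the glue pattern at its smallest scale: the hot work runs in C while the dynamic language orchestrates. The thread’s disagreement is over whether the glue layer’s own speed still matters once the heavy computation lives behind such calls.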
Other language comparisons (Dart, Crystal, LuaJIT, JVM languages)
- Dart’s strong showing surprises some, especially versus C# and LuaJIT; others point out that tiny benchmarks may be dominated by specific optimizations.
- Background on Dart’s VM lineage (from teams behind Self, HotSpot, V8) and its AOT+JIT design is mentioned.
- Crystal is brought up as a Ruby-like compiled language with Rails-esque frameworks and static binaries; some think omitting it from Ruby-speed discussions is odd.
- Others counter that Crystal is not Ruby and doesn’t help existing Ruby codebases.
- Node-vs-Deno and Java-vs-Kotlin differences are attributed to where each runtime’s optimization effort goes (e.g., the JVM being tuned primarily for Java) and to the extra bytecode generated by “guest” languages.
Benchmark design and visualization critiques
- The core Ruby benchmark (nested loops with array updates) is called “weird” and easy to algebraically collapse, suggesting it mostly measures a trivial hot loop.
- Some note compilers generally don’t do liveness analysis for individual array elements due to cost, even when it could enable dramatic simplifications.
- The article’s animated visualization of language speeds is criticized as distracting and hard to read quantitatively; static tables or bar charts are preferred by some, while others find the animation intuitive enough.
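The “easy to algebraically collapse” criticism can be illustrated with a hypothetical loop of the same shape as the criticized benchmark (a stand-in, not the article’s exact code): each inner loop only touches `a[i]` and accumulates a value independent of `i`, so the whole nest reduces to filling an array with a constant.

```ruby
# Hypothetical stand-in for the criticized benchmark shape.
N = 1_000
a = Array.new(N, 0)
N.times do |i|
  N.times do |j|
    a[i] += j % 7 # the same arithmetic series accumulated into every slot
  end
end

# Algebraic collapse: the inner sum is a constant, so the double loop
# is equivalent to filling the array with that constant.
per_slot = (0...N).sum { |j| j % 7 }
collapsed = Array.new(N, per_slot)
puts(a == collapsed) # => true
```

A sufficiently aggressive compiler could in principle perform this reduction, which is the critics’ point: a benchmark that collapses this easily mostly measures one trivial hot loop. The counterpoint in the thread is that real compilers rarely attempt per-element liveness analysis, so the loop does execute as written.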