Speeding up Ruby by rewriting C in Ruby

Ruby, YJIT, and alternative implementations

  • Discussion notes the idea of a “Ruby stdlib in Ruby” predates YJIT (e.g., Rubinius, TruffleRuby), with mixed past results (Rubinius slower than MRI).
  • TruffleRuby is highlighted as extremely fast; because it can execute C extensions through the same GraalVM JIT infrastructure as Ruby code, it can optimize across the Ruby/C boundary rather than treating native calls as opaque.
  • YJIT’s implementation history (from C to Rust) is mentioned; Rust is seen as a good trade-off despite build-toolchain friction.
  • Some report mixed real-world speedups from TruffleRuby vs MRI and stress careful benchmarking due to startup and warmup behavior.
  • TruffleRuby is open source and based on Graal; seen as “forkable” if Oracle ever changes direction.

Rails on TruffleRuby

  • One view: Rails “doesn’t work” on TruffleRuby and won’t soon, especially with Rails 8 requiring Ruby 3.2.
  • Counterpoint: TruffleRuby claims to run Rails and many gems; not being “100% MRI 3.2 compatible” doesn’t necessarily mean Rails is broken.
  • Overall status of full Rails compatibility is unclear from the thread.

Benchmarks, microbenchmarks, and interpretation

  • Some argue microbenchmarks are often dismissed too quickly: they do expose real issues (e.g., high function-call overhead in dynamic languages).
  • Others stress they are narrow: you can’t responsibly claim “X is N× slower than Y in general” from a tiny benchmark.
  • Links to larger benchmark suites (e.g., Benchmarks Game, other repos) are cited to show wide variance across implementations and tasks.
  • Methodological criticisms appear: too few runs, reliance on single wall-clock samples, no JMH harness for the JVM tests, and ignored startup/warmup costs.
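The methodology complaints above can be made concrete with a small sketch. This is a hypothetical microbenchmark (not from the article) using only Ruby's stdlib `Benchmark` module: it times several runs instead of one, since on a JIT implementation (YJIT, TruffleRuby) the first runs include compilation warmup, and a single wall-clock sample conflates that with steady-state speed.

```ruby
require "benchmark"

# Hypothetical workload: method-call overhead in a tight loop,
# the kind of cost microbenchmarks legitimately expose.
def add(a, b)
  a + b
end

def run(n)
  total = 0
  n.times { |i| total = add(total, i) }
  total
end

N = 500_000

# Take several timed runs; report the minimum (or discard the early
# runs) so JIT warmup does not dominate the measurement.
times = 5.times.map { Benchmark.realtime { run(N) } }

puts "runs (s): #{times.map { |t| t.round(4) }.join(', ')}"
puts "best:     #{times.min.round(4)}s"
```

The `benchmark-ips` gem automates this pattern (explicit warmup phase, iterations-per-second with error bars) and is closer to what JMH provides on the JVM.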

Python performance, C libraries, and mission-critical use

  • Several comments note that many Python workloads push heavy computation into C/Fortran libraries; Python acts as glue.
  • Others respond that any language with FFI can do this; the baseline slowness of pure Python still matters.
  • Debate over acceptability in constrained or mission-critical systems:
    • Some describe successful use of Python even on a satellite where extra milliseconds and milliwatts are acceptable.
    • Others argue that for highly power- or latency-sensitive systems (e.g., long-endurance drones), interpreter overhead and GC are prohibitive.
  • Concerns raised about dynamic languages for mission-critical software, even with optional static typing tools.
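The "glue language" pattern debated above has a direct Ruby analogue: a reduction written in pure Ruby pays interpreter dispatch on every iteration, while the equivalent builtin (`Array#sum`, implemented in C in CRuby) does the same work in native code. A minimal sketch, for illustration only:

```ruby
require "benchmark"

data = Array.new(1_000_000) { |i| i }

# Pure-Ruby reduction: each iteration pays block-call and
# method-dispatch overhead in the interpreter.
def ruby_sum(arr)
  total = 0
  arr.each { |x| total += x }
  total
end

Benchmark.bm(12) do |bm|
  bm.report("pure Ruby") { ruby_sum(data) }
  bm.report("C builtin") { data.sum } # Array#sum runs in C in CRuby
end
```

Both produce the same result; the gap between the two rows is exactly the "baseline slowness" the pure-Python side of the argument is pointing at, and the thing a JIT (or rewriting the hot path natively) tries to close.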

Other language comparisons (Dart, Crystal, LuaJIT, JVM languages)

  • Dart’s strong showing surprises some, especially versus C# and LuaJIT; others point out that tiny benchmarks may be dominated by specific optimizations.
  • Background on Dart’s VM lineage (from teams behind Self, HotSpot, V8) and its AOT+JIT design is mentioned.
  • Crystal is brought up as a Ruby-like compiled language with Rails-esque frameworks and static binaries; some think omitting it from Ruby-speed discussions is odd.
  • Others counter that Crystal is not Ruby and doesn’t help existing Ruby codebases.
  • Node vs Deno and Java vs Kotlin differences are attributed to JVM optimization focus and extra bytecode generated by “guest” languages.

Benchmark design and visualization critiques

  • The core Ruby benchmark (nested loops with array updates) is called “weird” and easy to algebraically collapse, suggesting it mostly measures a trivial hot loop.
  • Some note compilers generally don’t track liveness of individual array elements because the analysis is too costly, even when it would enable dramatic simplifications.
  • The article’s animated visualization of language speeds is criticized as distracting and hard to read quantitatively; static tables or bar charts are preferred by some, while others find the animation intuitive enough.
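The "algebraically collapsible" criticism is easiest to see with a sketch. The exact benchmark shape isn't reproduced in the notes, so this is a hypothetical reconstruction of nested loops with array updates whose result has a closed form, meaning a sufficiently clever compiler could in principle delete the hot loop entirely:

```ruby
# Hypothetical shape of the criticized benchmark: nested loops
# repeatedly incrementing array slots.
def nested_loop_bench(n)
  a = Array.new(n, 0)
  n.times do |i|
    n.times do |j|
      a[j] += i
    end
  end
  a
end

# Closed form: every slot accumulates 0 + 1 + ... + (n - 1),
# i.e. n * (n - 1) / 2, independent of the loop structure.
def collapsed(n)
  Array.new(n, n * (n - 1) / 2)
end

n = 100
puts nested_loop_bench(n) == collapsed(n) # same values, no hot loop
```

A benchmark like this mostly measures how fast an implementation grinds through a trivial loop body, which is why a win here doesn't generalize; per-element liveness analysis that would discover the closed form is, as noted above, usually too expensive for compilers to attempt.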