rr – record and replay debugger for C/C++

What rr is and how it’s used

  • Record-and-replay debugger for native Linux binaries, usually driven through gdb or a gdb frontend (cgdb, IDEs); Delve for Go can also replay rr recordings.
  • Typical workflow: record a failing run once, then replay it as many times as needed to inspect state. Users report doing this with large C/C++ projects, JITs, MPI jobs, mixed Python/native stacks, and Julia.
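
For intermittent failures, the "record a failing run" step is often a loop: keep recording until one run fails, then debug that trace. The following is a minimal Python sketch of that idea, not something taken from the discussion. The helper names (rr_record_argv, record_until_failure) are hypothetical, and it assumes that `rr record` exits with the recorded program's exit status. The `--chaos` flag enables rr's chaos mode.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def rr_record_argv(cmd: Sequence[str], chaos: bool = False) -> List[str]:
    """Build the argv for `rr record`; `--chaos` turns on rr's chaos mode."""
    argv = ["rr", "record"]
    if chaos:
        argv.append("--chaos")
    return argv + list(cmd)


def record_until_failure(
    cmd: Sequence[str],
    max_runs: int = 100,
    chaos: bool = False,
    run: Callable = subprocess.run,
) -> Optional[int]:
    """Record `cmd` repeatedly; return the 1-based number of the failing run.

    Assumption: `rr record` exits with the recorded program's exit status,
    so a nonzero return code means this recording captured a failure.
    Returns None if no run failed within `max_runs`.
    """
    argv = rr_record_argv(cmd, chaos=chaos)
    for attempt in range(1, max_runs + 1):
        # `run` is injectable so the loop can be exercised without rr installed.
        if run(argv).returncode != 0:
            return attempt
    return None
```

A call such as record_until_failure(["./flaky_test"], max_runs=20, chaos=True) would stop at the first failing recording (./flaky_test is a placeholder name). Running `rr replay` with no arguments afterwards opens the most recent trace, which is the failing one.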

Strengths and “killer features”

  • Reverse execution: set a watchpoint on a variable and “reverse-continue” to where it last changed; many describe this as transformative for debugging tricky bugs and reverse engineering.
  • Works with sanitizers (ASan, MSan; TSan not confirmed), so you can record a sanitizer-triggered failure and step backward to the root cause.
  • Overhead is often modest because ordinary instructions run natively on the CPU; rr mainly records system call results and other sources of nondeterminism.
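
The watchpoint-and-reverse-continue pattern can be written down as a short gdb command sequence. The Python sketch below only generates that sequence as text, for typing at the gdb prompt that `rr replay` opens or for saving to a gdb command file. The function name reverse_watch_commands and the example expression are hypothetical; the gdb commands themselves (continue, watch -l, reverse-continue, backtrace) are standard.

```python
from typing import List


def reverse_watch_commands(expr: str) -> List[str]:
    """Return gdb commands that locate the last write to `expr`.

    Intended for a replay session in which the program stops at its failure
    (a crash, an abort, or a sanitizer report) while `expr` is still in scope.
    """
    return [
        "continue",          # run forward until the failure stops the program
        f"watch -l {expr}",  # watch the memory location that expr refers to
        "reverse-continue",  # run backward to the last change of that memory
        "backtrace",         # show the call stack at the write
    ]


def as_gdb_script(expr: str) -> str:
    """Join the commands into text suitable for a gdb command file."""
    return "\n".join(reverse_watch_commands(expr)) + "\n"
```

For example, as_gdb_script("node->next") yields four lines starting with "continue" and "watch -l node->next". The `-l` form matters here: it watches the address the expression evaluates to, so the watchpoint stays valid when execution moves backward out of the current scope.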

Limitations and friction

  • Struggles with some concurrency bugs: rr runs all threads on a single core, so even with “chaos mode” (randomized scheduling) it exposes only coarser-grained races, and it cannot reproduce bugs that depend on weak memory ordering.
  • No GPU support; CUDA/OpenGL/Vulkan and drivers that modify process memory directly are problematic, though some workarounds (e.g., VirGL, software GL) are mentioned.
  • Linux-only; users miss it on macOS and Windows. Some commercial alternatives exist but can be very expensive.
  • Android and kernel debugging with rr are raised as ideas, but there are no clear success reports.
  • Past issues on Ryzen/Threadripper CPUs are reported as resolved, with documented workarounds.

Language and tooling integration

  • Not limited to C/C++; works with any native code with DWARF symbols (Rust, Zig, Go, Julia, RPython variants, etc.).
  • For managed languages (Python, JS), rr can debug at the interpreter/VM level; higher-level support is limited but partially achievable via gdb extensions.

Comparisons to other tools

  • gdb’s built-in reverse debugging predates rr but is widely described as orders of magnitude slower and far more limited (single-threaded, small snippets).
  • WinDbg time travel uses instruction-level emulation on Windows, with a reported 10–20× slowdown versus roughly 2× or less for rr in many cases.
  • Commercial tools Undo (undo.io) and Pernosco (built on rr) extend these capabilities: handling drivers and unrecorded processes, and providing a queryable execution history.
  • Browser/JavaScript-specific replay tools (e.g., replay.io) are mentioned separately.

Rust rewrite and memory-safety debate

  • A partial Rust port of rr exists; maintainers cite huge accumulated edge-case handling and ecosystem dependencies as barriers to adopting it.
  • Broader discussion on whether rewriting “working” C/C++ systems in Rust is worth it: trade-offs between stability of mature C code, Rust’s safety and abstraction benefits, and migration cost.
  • Government guidance to prefer memory-safe languages is noted, but real-world adoption constraints (existing Java/C++ stacks) are acknowledged.