rr – record and replay debugger for C/C++
What rr is and how it’s used
- Record-and-replay debugger for native Linux binaries, usually driven through gdb or a gdb frontend (cgdb, IDEs); Delve for Go can also use rr as a backend.
- Typical workflow: record a failing run once, then replay it as many times as needed to inspect state. Reported uses include large C/C++ projects, JITs, MPI jobs, mixed Python/native stacks, and Julia.
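The record-then-replay workflow can be sketched with a small program that only fails on some runs. This is a minimal illustration, not something from the discussion: the file name flaky.c, the function pick_index, and the DEMO_MAIN build flag are hypothetical. The rr commands in the header comment are the standard record and replay subcommands. Because rr records the clock value the program reads, every replay of a failing recording sees the same seed and fails in the same place.

```c
/* flaky.c: a hypothetical program that fails on roughly one run in five.
 *
 * Build:   cc -g -O0 -DDEMO_MAIN flaky.c -o flaky
 * Record:  rr record ./flaky    (repeat until the assertion fires)
 * Replay:  rr replay            (replays the most recent recording under gdb)
 */
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define TABLE_LEN 4

/* Deliberate bug: the modulus should be len, not len + 1,
 * so the result can equal len (one past the last valid index). */
int pick_index(unsigned r, int len) {
    return (int)(r % (unsigned)(len + 1));
}

#ifdef DEMO_MAIN
int main(void) {
    static const int table[TABLE_LEN] = {10, 20, 30, 40};
    unsigned seed = (unsigned)time(NULL);  /* usually differs between runs */
    int i = pick_index(seed, TABLE_LEN);

    assert(i < TABLE_LEN);                 /* fails when seed % 5 == 4 */
    printf("seed %u: table[%d] = %d\n", seed, i, table[i]);
    return 0;
}
#endif
```

Without a recording, reproducing the failure means waiting for an unlucky seed; with one, the failing run can be replayed as often as needed.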
Strengths and “killer features”
- Reverse execution: set a watchpoint on a variable and “reverse-continue” to where it last changed; many describe this as transformative for debugging tricky bugs and reverse engineering.
- Works with sanitizers (ASan and MSan are reported to work; TSan is not confirmed), so you can record a sanitizer-triggered run and step backward to the root cause.
- Overhead is often modest because ordinary user-space code runs on the CPU at native speed; rr mainly records system call results and other sources of nondeterminism.
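The watchpoint plus reverse-continue technique described above can be sketched as follows. This is an illustrative example under stated assumptions: job.c, struct job, g_job, the event functions, and the DEMO_MAIN flag are all hypothetical names. The commands in the header comment are standard rr and gdb commands, but the exact stops you see may differ (for example, the first reverse-continue may report the abort signal again before reaching a write).

```c
/* job.c: hypothetical example where finding the last writer of a field
 * finds the bug.
 *
 * Build:  cc -g -O0 -DDEMO_MAIN job.c -o job
 * A session might look like this (output omitted):
 *
 *   $ rr record ./job            # the assertion fails; the run is recorded
 *   $ rr replay
 *   (rr) continue                # run forward to the failure
 *   (rr) watch -l g_job.retries_left
 *   (rr) reverse-continue        # stops at the most recent write
 *   (rr) reverse-continue        # then the write before it, and so on
 */
#include <assert.h>
#include <stdio.h>

struct job {
    int id;
    int retries_left;
};

struct job g_job = {1, 0};  /* global so the watch expression works in any frame */

void start(struct job *j)      { j->retries_left = 3; }
void on_error(struct job *j)   { j->retries_left--; }
/* Deliberate bug: a timeout should cost one retry, not overwrite the count. */
void on_timeout(struct job *j) { j->retries_left = -1; }

/* Runs a fixed sequence of events and returns the remaining retries. */
int run(struct job *j) {
    start(j);       /* 3 */
    on_error(j);    /* 2 */
    on_timeout(j);  /* -1, the bad write */
    on_error(j);    /* -2 */
    return j->retries_left;
}

#ifdef DEMO_MAIN
int main(void) {
    int left = run(&g_job);
    assert(left >= 0);  /* fails: left is -2 */
    printf("retries left: %d\n", left);
    return 0;
}
#endif
```

Walking backward through the writes leads from the symptom (a negative count) to on_timeout without guessing where to put breakpoints. gdb's reverse-step, reverse-next, and reverse-finish can then be used to look around the offending write.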
Limitations and friction
- Struggles with some concurrency bugs: rr runs all threads on a single core, one at a time, so even with “chaos mode” (randomized scheduling) it only exposes coarser races, and it cannot reproduce bugs that depend on weak memory ordering.
- No GPU support; CUDA/OpenGL/Vulkan and drivers that modify process memory directly are problematic, though some workarounds (e.g., VirGL, software GL) are mentioned.
- Linux-only; users miss it on macOS and Windows. Some commercial alternatives exist but can be very expensive.
- Android and kernel debugging with rr are raised as ideas; no clear success reports.
- Past problems on AMD Ryzen/Threadripper CPUs are reported as resolved, with workarounds documented.
Language and tooling integration
- Not limited to C/C++; works with any native code with DWARF symbols (Rust, Zig, Go, Julia, RPython variants, etc.).
- For managed languages (Python, JS), rr can debug at the interpreter/VM level; higher-level support is limited but partially achievable via gdb extensions.
Comparisons to other tools
- gdb’s built-in reverse debugging predates rr but is widely described as orders of magnitude slower and far more limited (single-threaded, small snippets).
- WinDbg Time Travel Debugging uses instruction-level emulation on Windows, with a reported 10–20× slowdown versus roughly 2× or less for rr in many cases.
- Undo.io and Pernosco are commercial tools (Pernosco is built on rr; Undo uses its own recording engine) that extend capabilities: handling drivers/unrecorded processes and providing a queryable execution history.
- Browser/JavaScript-specific replay tools (e.g., replay.io) are mentioned separately.
Rust rewrite and memory-safety debate
- A partial Rust port of rr exists; maintainers cite huge accumulated edge-case handling and ecosystem dependencies as barriers to adopting it.
- Broader discussion on whether rewriting “working” C/C++ systems in Rust is worth it: trade-offs between stability of mature C code, Rust’s safety and abstraction benefits, and migration cost.
- Government guidance to prefer memory-safe languages is noted, but real-world adoption constraints (existing Java/C++ stacks) are acknowledged.