Translating All C to Rust (TRACTOR)
Program goals and DARPA context
- TRACTOR aims for highly automated translation of existing C to Rust, ideally producing code of the quality a skilled Rust developer would write and eliminating memory‑safety bugs.
- Commenters stress that DARPA routinely funds “DARPA‑hard” problems: success is not guaranteed; partial wins and research outputs are expected.
- Prior related efforts exist (c2rust, ROSE, CRAM, Managed Sulong), some also DARPA/DOE funded.
Feasibility of C→Rust translation
- Many think full, automatic translation to safe idiomatic Rust is extremely hard:
- Need to reconstruct lifetimes, ownership, aliasing, array sizes, concurrency discipline, and intended invariants that C doesn’t encode.
- C often contains real memory bugs; there is no unique “correct” safe Rust equivalent.
- Non‑affine pointer use, back‑references, unions, macros, and threading make borrow‑checker‑friendly structures nontrivial.
- Likely outputs:
- Mechanical translation to mostly unsafe Rust (as c2rust does) is seen as doable but of limited value: bug‑compatible, unreadable, hard to refactor.
- Some suggest interactive tools: translate as far as possible, flag problem areas, and guide humans to redesign those parts.
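To make the gap between the two kinds of output concrete, here is a minimal sketch. Both functions are hand-written illustrations, not actual c2rust output: `sum_raw` approximates what a line-by-line translation of a hypothetical C function `int sum(const int *a, size_t n)` might look like, and `sum` is one possible idiomatic rewrite. The wrapping arithmetic is an assumption chosen so that both versions behave identically.

```rust
/// Approximation of a mechanical translation (hypothetical, hand-written).
/// The length travels separately from the pointer, exactly as in C, so the
/// compiler cannot check that `n` matches the real allocation.
pub unsafe fn sum_raw(a: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < n {
        // SAFETY: the caller must guarantee that `a` points to at least
        // `n` readable, initialized `i32` values.
        total = total.wrapping_add(unsafe { *a.add(i) });
        i += 1;
    }
    total
}

/// One possible idiomatic rewrite: the slice carries its own length, so the
/// "array size" invariant that C left implicit is now part of the type.
pub fn sum(a: &[i32]) -> i32 {
    a.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

Getting from the first form to the second requires knowing that `a` and `n` describe the same array, which is the kind of intent the C source does not state and a translator has to infer.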
Rust, undefined behavior, and Miri
- Strong claim in thread: by design, safe Rust (excluding compiler bugs and unsafe code beneath abstractions) cannot exhibit UB; UB always originates in unsafe code.
- Counterpoints:
- UB can “leak” into safe code via unsound unsafe libraries or std; there have been such bugs.
- Compiler and LLVM bugs exist; these are treated as implementation bugs, not language‑level UB.
- Miri (a MIR interpreter) is discussed:
- Used to detect UB in unsafe Rust; useful but not sufficient to guarantee safety.
- Some argue referencing Miri doesn’t refute Rust’s safety model; others use it to highlight real UB cases and temper absolutist safety rhetoric.
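The "leak" counterpoint can be illustrated with a small sketch. The function names below are hypothetical and not taken from any real library. `get_unsound` has a safe signature but skips the bounds check, so a caller who writes no unsafe code can still trigger UB; `get_sound` checks the precondition first.

```rust
/// UNSOUND (hypothetical example): the signature is safe, but nothing
/// verifies `i`. Calling this with `i >= v.len()` is undefined behavior
/// even though the caller wrote no `unsafe`.
pub fn get_unsound(v: &[u8], i: usize) -> u8 {
    unsafe { *v.get_unchecked(i) }
}

/// Sound variant: the unsafe access only happens after the bounds check.
pub fn get_sound(v: &[u8], i: usize) -> Option<u8> {
    if i < v.len() {
        // SAFETY: `i < v.len()` was checked on the line above.
        Some(unsafe { *v.get_unchecked(i) })
    } else {
        None
    }
}
```

In both cases the defect, if any, sits inside the unsafe block, which is consistent with the "UB originates in unsafe" claim. An out-of-range call to `get_unsound` is the sort of error Miri is designed to report when tests run under `cargo miri test`, but only on inputs that a test actually exercises, which is why commenters describe it as useful but not sufficient.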
Alternatives to “rewrite it in Rust”
- Several argue for improving C/C++ instead, with:
- Model checking and formal verification (CBMC and Frama‑C for C; Kani for Rust), sanitizers, strict coding standards (MISRA, JPL), and safer libraries.
- Claim: you can get very strong guarantees for existing C if you invest in these tools.
- Others respond that:
- Industry has had decades to adopt such practices; memory‑safety CVEs still dominate.
- Safer defaults (Rust, Ada/SPARK, managed languages) reduce reliance on exceptional discipline.
AI’s potential role
- Some see LLMs + verification as promising: mechanical C→Rust pass, then AI‑guided refactoring, checked by compilers, tests, fuzzers, sanitizers, Miri.
- Skeptics note LLMs are weak at long‑range correctness; automatic translation may still require heavy human validation.
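The "translate, refactor, then check" loop can be sketched as a differential test: run the mechanical translation and the refactored candidate on the same inputs and report the first disagreement. Everything below is a hypothetical stand-in written for illustration: `checksum_raw` plays the role of the mechanical output, `checksum_safe` the role of a human- or AI-refactored version, and the tiny xorshift generator is there only so the sketch needs no external crates.

```rust
/// Stand-in for a mechanically translated routine (hypothetical).
unsafe fn checksum_raw(p: *const u8, n: usize) -> u32 {
    let mut acc: u32 = 0;
    let mut i = 0;
    while i < n {
        // SAFETY: the caller guarantees `p` points to `n` readable bytes.
        let byte = unsafe { *p.add(i) };
        acc = acc.wrapping_mul(31).wrapping_add(byte as u32);
        i += 1;
    }
    acc
}

/// Stand-in for the refactored candidate (hypothetical).
fn checksum_safe(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

/// Minimal xorshift step; deterministic for a given starting state.
fn next(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Returns the first generated input on which the two versions disagree.
fn find_mismatch(cases: usize, seed: u64) -> Option<Vec<u8>> {
    let mut state = seed.max(1); // xorshift must not start at zero
    for _ in 0..cases {
        let len = (next(&mut state) % 64) as usize;
        let input: Vec<u8> = (0..len).map(|_| next(&mut state) as u8).collect();
        // SAFETY: pointer and length come from the same live Vec.
        let old = unsafe { checksum_raw(input.as_ptr(), input.len()) };
        let new = checksum_safe(&input);
        if old != new {
            return Some(input);
        }
    }
    None
}
```

A passing run only shows agreement on the sampled inputs, which is the skeptics' point: checks like this narrow the search for errors but do not replace human review of the refactored code.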
Maintainability and organizational issues
- Concern that machine‑translated “shitty Rust” will be harder to maintain than the original C and drive developers away.
- For long‑lived systems, teams must actually understand and be able to evolve the translated Rust; otherwise C remains the de facto source.