The unexpected effectiveness of one-shot decompilation with Claude
LLMs as Decompilation and Reverse-Engineering Tools
- Many commenters report strong results using Claude and other LLMs (especially with Ghidra/IDA) to:
  - Clean up decompiled C, infer function purposes, and identify assembly tricks.
  - Comment JIT output or highly optimized/minified code, and compare compiler outputs.
- Gemini is noted as also good at assembly and bytecode-level tasks; Codex is seen as more tuned for mainstream dev work.
Workflows, Heuristics, and Tooling
- The post’s “headless loop + heuristics + compiler match” approach is praised as a concrete, useful pattern.
- Key techniques:
  - Work function-by-function when possible; whole-file input is sometimes needed when registers are reused unpredictably.
  - Use a “give up after N attempts” heuristic to cap wasted tokens.
  - Exploit large context windows to analyze wide code regions and trace flows.
- Some want more structured, step‑by‑step tutorials and tighter grammars that constrain the model to valid C; others say a simple “compile + feed errors back” loop is enough.
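
The “compile + feed errors back” loop with a “give up after N attempts” cap can be sketched roughly as follows. This is a minimal illustration, not the post's actual harness: `propose` and `check` are hypothetical callables standing in for the model call and for the compile-and-compare step, and `max_attempts` is an assumed parameter name.

```python
from typing import Callable, Optional, Tuple


def decompile_function(
    asm: str,
    propose: Callable[[str, str], str],
    check: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return matching C source for one function, or None after giving up.

    propose(asm, feedback) is a hypothetical stand-in for the model call.
    check(source) is a hypothetical stand-in for "compile, then compare
    with the original"; it returns (matched, feedback).
    """
    feedback = ""  # the first attempt has no compiler output to react to
    for _ in range(max_attempts):
        source = propose(asm, feedback)
        matched, feedback = check(source)
        if matched:
            return source
        # Otherwise loop again, passing the errors or diff back to the model.
    return None  # cap wasted tokens: stop after max_attempts tries
```

Working one function at a time keeps each call small. A caller that hits the whole-file case mentioned above could pass the full file's assembly as `asm` instead.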
Limits, Complexity, and Non‑Expert Use
- Commenters warn that one‑shot reverse engineering for non‑experts is still weak; you must give the model tight constraints, goals, and validation.
- LLMs often misestimate task difficulty and duration, both overshooting and undershooting.
- There’s debate over what “one‑shot” means (single prompt vs single example vs non-interactive loop).
Documentation and Developer Workflow
- Many see LLMs as excellent for generating “how it works” docs and for translating and synthesizing sparse or foreign‑language documentation.
- Commenters are skeptical of auto‑invented rationales (“why it’s this way”) and want human review of them.
- Some argue LLMs reduce the need for human docs; others frame docs as an “error-correcting code” to detect mismatches between intent and implementation.
Legal, Licensing, and Privacy Concerns
- A substantial subthread covers the distinction between “open source” and “source available,” and how decompilations are derivative works that carry their own, but constrained, licensing.
- Clean‑room reverse engineering is contrasted with distributing decompiled code.
- Several raise concerns about uploading copyrighted binaries to cloud LLMs: potential evidence trails, DMCA/fair‑use ambiguity, and jurisdictional risks.
Decompilation, Obfuscation, and the Future of Software
- Some speculate that near‑trivial decompilation could make most binaries effectively “source available,” provoking shifts to cloud‑only or hardware‑locked distribution.
- Others expect counter‑moves: LLM‑assisted obfuscation or exotic schemes (e.g., homomorphic VMs) to make analysis harder.
- There’s disagreement on timelines: some think “everything decompilable” is far off; others see it as inevitable and beneficial for preservation.
Game Preservation and Retro Computing
- Multiple examples of LLM‑assisted ports and analysis: classic BIOSes, Prince of Persia on Apple II, and older PC/console games.
- Matching original binaries requires reconstructing old toolchains and flags; flakiness and inter‑function dependencies often prevent 100% exact matches, but “99%+ matching, 100% functional” is common.
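
A figure like “99%+ matching” implies some measure of how close the recompiled bytes are to the original. One naive way to compute such a score is sketched below, assuming a plain position-by-position byte comparison. The function name `match_ratio` is hypothetical, and the commenters' projects may well measure matching differently (for example per instruction or per function).

```python
def match_ratio(original: bytes, candidate: bytes) -> float:
    """Fraction of byte positions where candidate equals original.

    A length mismatch counts against the score: the denominator is the
    longer of the two inputs. Two empty inputs are treated as a full match.
    """
    longest = max(len(original), len(candidate))
    if longest == 0:
        return 1.0
    # zip stops at the shorter input, so extra trailing bytes never match.
    same = sum(a == b for a, b in zip(original, candidate))
    return same / longest
```

Because the comparison is positional, a single inserted or deleted byte shifts everything after it and makes the score pessimistic; an alignment-based diff would be more forgiving.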