Comprehension debt: A ticking time bomb of LLM-generated code
Scope of “Comprehension Debt”
- Many see this as an old problem (legacy systems, offshore code, intern code) that LLMs greatly amplify rather than create anew.
- Others argue LLM code is qualitatively different: there may be no human mental model behind it at all, only a plausible-looking surface.
Human vs LLM Code and Institutional Knowledge
- Human-written code often comes with institutional memory, design docs, tickets, and the possibility of asking “why?”—even if imperfectly.
- LLMs can explain what code does, but commenters doubt they can reliably explain why it’s structured that way or which trade‑offs were intended.
- Several connect this to Peter Naur’s “Programming as Theory Building”: LLMs remove even the incidental theory-building you get from typing the code yourself.
Tests, Specs, and Design as Counterweights
- Many propose spec‑driven or test‑driven workflows: have LLMs generate code plus tests, enforce style/architecture rules, and treat specs as the real artifact.
- Critics note LLM tests often mirror the same misunderstanding as the code, so both must still be reviewed; tests can become vacuous or wrong (see the sketch after this list).
- Strong modularization, explicit interfaces, and richer documentation (possibly LLM‑assisted) are seen as key to containing comprehension debt.
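
The mirrored-test failure mode is worth illustrating: when a model misreads the intent, the tests it generates often encode the same misreading, so the suite passes while code and tests are both wrong. A minimal sketch in Python (function and test names are hypothetical):

```python
def trailing_window(items, n):
    """Intended spec: return the last n items.
    The (hypothetical) generated code is off by one and drops the final element."""
    return items[-n - 1:-1]  # bug: should be items[-n:]

def test_trailing_window():
    # Generated from the same misunderstanding, the assertion pins the
    # buggy output rather than the spec, so the test passes vacuously.
    assert trailing_window([1, 2, 3, 4], 2) == [2, 3]  # spec says [3, 4]
```

Because the test is derived from the same flawed model of the problem, only an independent review against the spec catches the defect.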
Workflow, Quality, and Management Incentives
- Concern that management treats AI as a pure speed multiplier, pressuring reviewers to rubber‑stamp growing volumes of opaque code.
- Fear that this accelerates the slide toward “barely functional” quality norms and drives out engineers who care about design and polish.
- Some liken LLM coding to earlier waves of sloppy abstraction (EJBs, ORMs, JS frameworks), but at far higher volume and speed.
Where LLMs Work Well (Today)
- Refactoring under strong test coverage; bulk mechanical changes such as API shifts and renames (sketched after this list).
- One‑off utilities, data munging scripts, sample code, and boilerplate.
- Helping understand unfamiliar or legacy codebases by answering localized “what does this do?” questions—though hallucinated explanations are a risk.
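
The refactoring case rests on a simple discipline: pin current behavior with characterization tests before letting a model make the bulk edit, then rerun the suite afterward. A minimal sketch, assuming a hypothetical rename of `parse_record` to `parse_row` as part of an API shift:

```python
def parse_row(line: str) -> dict:
    """Renamed from parse_record in a bulk API shift; behavior must not change."""
    key, _, value = line.partition("=")
    return {key.strip(): value.strip()}

def test_parse_row_characterization():
    # Input/output pairs recorded before the refactor; if the mechanical
    # change altered behavior, this fails immediately.
    assert parse_row("name = ada") == {"name": "ada"}
    assert parse_row("empty=") == {"empty": ""}
```

The tests carry no opinion about the new design; they only guarantee that a mechanical change stayed mechanical.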
Future Trajectories and Disagreement
- Optimists expect future models to handle both comprehension and maintenance of LLM‑generated spaghetti, making today’s debt moot.
- Skeptics doubt core issues (hallucinations, lack of genuine understanding, ambiguous natural‑language “specs”) will vanish quickly, and worry about long‑term skill atrophy and write‑only codebases.