AI slows down open source developers. Peter Naur can teach us why
Study findings and perception gap
- Developers in the cited RCT expected ~20% speedup from AI and felt ~20% faster afterward, but their measured outcomes ranged from no gain to ~40% slower.
- Commenters link this to a general human inability to accurately perceive time and productivity; people judge “how busy I felt” rather than outcome.
- Analogies raised: keyboard vs mouse studies, Waze choosing “busy-feeling” routes, and gambling-like reinforcement where AI “feels” helpful even when it isn’t.
Debate over study validity and scope
- The paper only covers early‑2025 tools, experienced OSS maintainers, large familiar repos, and tasks randomized into “AI allowed” vs “no AI.”
- Critics highlight the small sample (16 devs), wide confidence intervals, self‑reported times, maintainer‑selected issues, participants' inexperience with Cursor, and possible ordering/spillover effects.
- The authors respond that the slowdown likely stems from multiple contributing factors rather than a single cause, and that a key robust result is the mismatch between perceived and measured productivity.
- Several participants stress that results shouldn’t be over‑generalized to all devs, all tasks, or future models.
Flow, context switching, and mandated tools
- Many describe AI interactions as breaking flow: each prompt/review cycle disrupts concentration and increases fatigue.
- Mandatory use of AI IDEs (e.g., Cursor) is reported as demoralizing, with some feeling clearly slower but socially pressured not to say so.
Mental models, familiarity, and where AI helps
- Drawing on Naur’s “Programming as Theory Building,” several argue that when you already hold a rich mental model of a codebase, AI mostly gets in the way.
- Others find AI very useful for:
  - Ramp‑up on unfamiliar repos (asking “where is X implemented?” or “which files to change?”).
  - Greenfield features, one‑off scripts, boilerplate, tests, and learning new languages.
- There’s concern that fast ramp‑up via AI may shortcut deep understanding, leaving a permanent knowledge gap.
Quality, maintenance, and AI‑generated code
- Maintainers report low‑quality AI PRs: muting errors instead of fixing root causes, over‑refactoring, noisy try/excepts, and “commits for the resume” (the error‑muting pattern is sketched after this list).
- Review cost rises because AI can cheaply generate large, shallow changes that still require careful human scrutiny.
- Some use AI mainly as a “rubber duck” or critic, asking it to find bugs or poke holes in designs; they report higher quality but not more speed.
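To make the “muting errors” complaint concrete, here is a minimal Python sketch of the anti‑pattern versus the fix maintainers describe. The function name `average_latency` and the empty‑input scenario are hypothetical illustrations, not code from the study or the thread.

```python
# Anti-pattern reviewers complain about: the exception is muted, not fixed.
def average_latency(samples):
    try:
        return sum(samples) / len(samples)
    except Exception:
        return 0  # silently hides empty input and any other real bug


# Addressing the root cause instead: fail loudly on the actual problem.
def average_latency_fixed(samples):
    if not samples:
        raise ValueError("samples must be a non-empty sequence")
    return sum(samples) / len(samples)
```

The first version passes superficial review and works on the happy path, which is why such changes are cheap to generate but still demand careful human scrutiny.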
Broader attitudes and future trajectory
- Views range from “AI cult / emperor’s new clothes / hype like web3” to “this already gives huge speedups for me; anecdotes matter more than one study.”
- Several emphasize that effective AI use is a distinct skill, that tools are improving rapidly, and that the key question is which tasks and workflows AI actually benefits.