AI slows down open source developers. Peter Naur can teach us why

Study findings and perception gap

  • Developers in the cited RCT expected a ~20% speedup from AI and felt ~20% faster afterward, but their measured results ranged from no gain to ~40% slower.
  • Commenters link this to a general human inability to accurately perceive time and productivity; people judge “how busy I felt” rather than outcome.
  • Analogies raised: keyboard vs mouse studies, Waze choosing “busy-feeling” routes, and gambling-like reinforcement where AI “feels” helpful even when it isn’t.

Debate over study validity and scope

  • The paper covers only early‑2025 tools, experienced OSS maintainers, large familiar repos, and tasks randomized into “AI allowed” vs “no AI.”
  • Critics highlight the small sample (16 devs), wide confidence intervals, self‑reported times, issue selection by the maintainers, most participants being new to Cursor, and possible ordering/spillover effects.
  • The authors respond that multiple factors likely contribute to slowdown, not a single cause, and that a key robust result is the mismatch between perceived and measured productivity.
  • Several participants stress that results shouldn’t be over‑generalized to all devs, all tasks, or future models.

Flow, context switching, and mandated tools

  • Many describe AI interactions as breaking flow: each prompt/review cycle disrupts concentration and increases fatigue.
  • Mandatory use of AI IDEs (e.g., Cursor) is reported as demoralizing, with some feeling clearly slower but socially pressured not to say so.

Mental models, familiarity, and where AI helps

  • Drawing on Naur’s “Programming as Theory Building,” several commenters argue that when you already hold a rich mental model of a codebase, AI mostly gets in the way.
  • Others find AI very useful for:
    • Ramp‑up on unfamiliar repos (asking “where is X implemented?”, “which files to change?”).
    • Greenfield features, one‑off scripts, boilerplate, tests, and learning new languages.
  • There’s concern that fast ramp‑up via AI may shortcut deep understanding, leaving a permanent knowledge gap.

Quality, maintenance, and AI‑generated code

  • Maintainers report low‑quality AI PRs: muting errors instead of fixing root causes, over‑refactoring, noisy try/excepts (illustrated in the sketch after this list), and “commits for the resume.”
  • Review cost rises because AI can cheaply generate large, shallow changes that still require careful human scrutiny.
  • Some use AI mainly as a “rubber duck” or critic, asking it to find bugs or poke holes in designs; they report higher quality but not greater speed.
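
A minimal sketch of the error‑muting pattern described above, assuming a Python project; load_config and its JSON config format are invented for illustration and are not taken from the study or the thread:

```python
import json

def load_config_silenced(path):
    # The pattern maintainers complain about: a blanket try/except swallows
    # every failure, so a malformed or missing config silently becomes an
    # empty dict and the underlying bug never surfaces.
    try:
        with open(path) as f:
            return json.load(f)
    except Exception:
        return {}

def load_config(path):
    # What reviewers ask for instead: handle only the expected case
    # (a missing file) and let genuine parse errors propagate with context.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError as exc:
        raise RuntimeError(f"config file not found: {path}") from exc
```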

Broader attitudes and future trajectory

  • Views range from “AI cult / emperor’s new clothes / hype like web3” to “this already gives huge speedups for me; anecdotes matter more than one study.”
  • Several emphasize that effective AI use is a distinct skill, tools are improving rapidly, and the key question is which tasks and workflows AI actually benefits.