AI slows down open source developers. Peter Naur can teach us why

Study findings and perception gap

  • Developers in the cited RCT expected a ~20% speedup from AI and felt ~20% faster afterward, but their measured results ranged from no gain to ~40% slower.
  • Commenters link this to a general human inability to accurately perceive time and productivity; people judge “how busy I felt” rather than outcome.
  • Analogies raised: keyboard vs mouse studies, Waze choosing “busy-feeling” routes, and gambling-like reinforcement where AI “feels” helpful even when it isn’t.

Debate over study validity and scope

  • The paper covers only early‑2025 tools, experienced OSS maintainers, large familiar repos, and tasks randomized into “AI allowed” vs “no AI.”
  • Critics highlight the small sample (16 devs), wide confidence intervals, self‑reported times, issue selection by the maintainers, most participants being new to Cursor, and possible ordering/spillover effects.
  • The authors respond that multiple factors likely contribute to slowdown, not a single cause, and that a key robust result is the mismatch between perceived and measured productivity.
  • Several participants stress that results shouldn’t be over‑generalized to all devs, all tasks, or future models.

Flow, context switching, and mandated tools

  • Many describe AI interactions as breaking flow: each prompt/review cycle disrupts concentration and increases fatigue.
  • Mandatory use of AI IDEs (e.g., Cursor) is reported as demoralizing, with some feeling clearly slower but socially pressured not to say so.

Mental models, familiarity, and where AI helps

  • Drawing on Naur’s “Programming as Theory Building,” several commenters argue that when you already hold a rich mental model of a codebase, AI mostly gets in the way.
  • Others find AI very useful for:
    • Ramp‑up on unfamiliar repos (asking “where is X implemented?”, “which files to change?”).
    • Greenfield features, one‑off scripts, boilerplate, tests, and learning new languages.
  • There’s concern that fast ramp‑up via AI may shortcut deep understanding, leaving a permanent knowledge gap.

Quality, maintenance, and AI‑generated code

  • Maintainers report low‑quality AI PRs: muting errors instead of fixing root causes, over‑refactoring, noisy try/excepts (illustrated in the sketch after this list), and “commits for the resume.”
  • Review cost rises because AI can cheaply generate large, shallow changes that still require careful human scrutiny.
  • Some use AI mainly as a “rubber duck” or critic, asking it to find bugs or poke holes in designs; they report higher quality but not greater speed.
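
A minimal sketch of the error‑muting pattern described above, assuming a Python project; load_config and its JSON config format are invented for illustration and are not taken from the study or the thread:

```python
import json

def load_config_silenced(path):
    # The pattern maintainers complain about: a blanket try/except swallows
    # every failure, so a malformed or missing config silently becomes an
    # empty dict and the underlying bug never surfaces.
    try:
        with open(path) as f:
            return json.load(f)
    except Exception:
        return {}

def load_config(path):
    # What reviewers ask for instead: handle only the expected case
    # (a missing file) and let genuine parse errors propagate with context.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError as exc:
        raise RuntimeError(f"config file not found: {path}") from exc
```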

Broader attitudes and future trajectory

  • Views range from “AI cult / emperor’s new clothes / hype like web3” to “this already gives huge speedups for me; anecdotes matter more than one study.”
  • Several emphasize that effective AI use is a distinct skill, tools are improving rapidly, and the key question is which tasks and workflows AI actually benefits.