The 70% AI productivity myth: why most companies aren't seeing the gains

Perceived Productivity Gains

  • Strong split: some claim LLMs have no proven productivity benefit and often slow experienced devs; others report 10–20% gains or dramatic acceleration on greenfield work and small tools.
  • Several anecdotes of “a year of R&D in two months” or rapid MVPs, but with concerns about the hardening/maintenance phase still ahead.
  • Many say LLMs reduce cognitive load and make work feel easier, which may be mistaken for true productivity.

Quality, Tech Debt, and Code Review

  • Common concern: AI-generated code creates heavy, fast-accumulating tech debt—duplicated logic, unused branches, reintroduced bugs, and incomprehensible structures.
  • Two options are described: painstaking line‑by‑line review (eroding speed gains) or paying an unpredictable “debt interest” later.
  • “Yolo AI coding” is compared to a payday loan: immediate relief, long-term pain.
  • Some mitigate this by forcing LLMs to write tests and docs, then validating both carefully (one way to validate generated tests is sketched after this list); others cascade AI-on-AI code review, which may just be “turtles all the way down.”
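
One way to make “validate the generated tests carefully” concrete is a small mutation-style check: run the LLM-written tests against the trusted implementation (they should all pass) and against a deliberately broken copy (at least one should fail), which flags vacuous test suites before anyone relies on them. This is a minimal illustrative sketch, not a workflow anyone in the thread described verbatim; slugify, broken_slugify, and the test names are hypothetical stand-ins, and Python is assumed.

    import re

    def slugify(text: str) -> str:
        """Reference implementation the generated tests are supposed to pin down."""
        text = text.strip().lower()
        text = re.sub(r"[^a-z0-9]+", "-", text)
        return text.strip("-")

    def broken_slugify(text: str) -> str:
        """Deliberate mutant: forgets to lowercase. Useful tests must catch this."""
        text = text.strip()
        text = re.sub(r"[^A-Za-z0-9]+", "-", text)
        return text.strip("-")

    # Imagine these came back from the LLM; a human still reads them line by line.
    def test_basic(fn):
        assert fn("Hello, World!") == "hello-world"

    def test_collapses_separators(fn):
        assert fn("a  b--c") == "a-b-c"

    GENERATED_TESTS = [test_basic, test_collapses_separators]

    def suite_verdict(tests, reference, mutant):
        """True only if every test passes on the reference implementation and at
        least one test fails on the mutant, i.e. the suite actually constrains
        behavior instead of passing no matter what."""
        def passes(test, impl):
            try:
                test(impl)
                return True
            except AssertionError:
                return False

        all_pass_on_reference = all(passes(t, reference) for t in tests)
        catches_mutant = any(not passes(t, mutant) for t in tests)
        return all_pass_on_reference and catches_mutant

    if __name__ == "__main__":
        print("suite trusted:", suite_verdict(GENERATED_TESTS, slugify, broken_slugify))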

Junior Developers and Non-Programmers

  • LLMs let interns and non‑CS people build UIs and small apps they couldn’t have created before.
  • Disagreement over whether this is “real” competence or facilitated illiteracy in new grads who can’t code without AI.
  • Advice to juniors: use AI to explain systems and generate simple functions, but design data structures, algorithms, and key logic yourself.

Context: Startups vs Enterprises

  • Consensus that the biggest perceived gains are in:
    • Small teams and greenfield projects.
    • Boilerplate-heavy frontend and “easy” backend/ops tasks.
  • In large organizations, gains are limited by:
    • Legacy, complex codebases beyond a model’s context window.
    • Heavy processes (reviews, compliance, security tools) and underpowered corporate machines.
    • Poor governance and “cargo-cult” adoption of big-tech patterns and SaaS.

Evidence and the METR Study

  • METR study: experienced OSS devs believed AI made them faster but were actually ~19% slower.
  • Interpretations diverge:
    • Some read it as evidence that developers are bad at self-assessing AI productivity.
    • Others argue that short-term experiments with unfamiliar tools understate potential long-term benefits.
  • Noted that incentives heavily favor publishing “insane improvement” studies; the relative lack of such credible results is seen as informative.

Hype, Measurement, and Future Use

  • Many criticize “evolve or die” AI marketing and inflated “70% productivity” claims, arguing that the hype fuels backlash.
  • Measuring knowledge-work productivity is seen as intrinsically hard; stats are treated with skepticism.
  • Some expect that real gains will require new workflows (more planning, testing, and review around AI), organizational change, and time for “AI fluency” to develop.