The 70% AI productivity myth: why most companies aren't seeing the gains
Perceived Productivity Gains
- Opinions split sharply: some claim LLMs have no proven productivity benefit and often slow experienced devs; others report 10–20% gains or dramatic acceleration on greenfield work and small tools.
- Several anecdotes describe “a year of R&D in two months” or rapid MVPs, tempered by concerns that the hardening/maintenance phase is still ahead.
- Many say LLMs reduce cognitive load and make work feel easier, which may be mistaken for true productivity.
Quality, Tech Debt, and Code Review
- Common concern: AI-generated code creates heavy, fast-accumulating tech debt—duplicated logic, unused branches, reintroduced bugs, and incomprehensible structures.
- Two options are described: painstaking line‑by‑line review (eroding speed gains) or paying an unpredictable “debt interest” later.
- “Yolo AI coding” is compared to a payday loan: immediate relief, long-term pain.
- Some mitigate this by forcing LLMs to write tests and docs, then validating both carefully; others cascade AI-on-AI code review, which may just be “turtles all the way down.”
Junior Developers and Non-Programmers
- LLMs let interns and non‑CS people build UIs and small apps they couldn’t have created before.
- Disagreement over whether this is “real” competence or facilitated illiteracy in new grads who can’t code without AI.
- Advice to juniors: use AI to explain systems and generate simple functions, but design data structures, algorithms, and key logic yourself.
Context: Startups vs Enterprises
- Consensus that the biggest perceived gains are in:
  - Small teams and greenfield projects.
  - Boilerplate-heavy frontend and “easy” backend/ops tasks.
- In large organizations, gains are limited by:
  - Legacy, complex codebases beyond a model’s context window.
  - Heavy processes (reviews, compliance, security tools) and underpowered corporate machines.
  - Poor governance and “cargo-cult” adoption of big-tech patterns and SaaS.
Evidence and the METR Study
- METR study: experienced OSS devs believed AI made them faster but were actually ~19% slower.
- Commenters read the result in two ways:
  - Developers are bad at self-assessing AI productivity.
  - Short-term experiments with unfamiliar tools understate potential long-term benefits.
- Commenters note that incentives heavily favor publishing “insane improvement” studies, so the relative scarcity of credible results of that kind is itself seen as informative.
Hype, Measurement, and Future Use
- Many criticize “evolve or die” AI marketing and inflated “70% productivity” claims, arguing that this hype fuels backlash.
- Measuring knowledge-work productivity is seen as intrinsically hard; stats are treated with skepticism.
- Some expect that real gains will require new workflows (more planning, testing, and review around AI), organizational change, and time for “AI fluency” to develop.