The 70% AI productivity myth: why most companies aren't seeing the gains
Perceived Productivity Gains
- Opinions split sharply: some claim LLMs have no proven productivity benefit and often slow experienced devs; others report 10–20% gains or dramatic acceleration on greenfield work and small tools.
- Several anecdotes describe “a year of R&D in two months” or rapid MVPs, tempered by concerns that the hardening/maintenance phase is still ahead.
- Many say LLMs reduce cognitive load and make work feel easier, which may be mistaken for true productivity.
Quality, Tech Debt, and Code Review
- Common concern: AI-generated code creates heavy, fast-accumulating tech debt—duplicated logic, unused branches, reintroduced bugs, and incomprehensible structures.
- Two options are described: painstaking line‑by‑line review (eroding speed gains) or paying an unpredictable “debt interest” later.
- “Yolo AI coding” is compared to a payday loan: immediate relief, long-term pain.
- Some mitigate this by forcing LLMs to write tests and docs, then validating both carefully; others cascade AI-on-AI code review, which may just be “turtles all the way down.”
Junior Developers and Non-Programmers
- LLMs let interns and non‑CS people build UIs and small apps they couldn’t have created before.
- Disagreement over whether this is “real” competence or facilitated illiteracy in new grads who can’t code without AI.
- Advice to juniors: use AI to explain systems and generate simple functions, but design data structures, algorithms, and key logic yourself.
Context: Startups vs Enterprises
- Consensus that the biggest perceived gains are in:
  - Small teams and greenfield projects.
  - Boilerplate-heavy frontend and “easy” backend/ops tasks.
- In large organizations, gains are limited by:
  - Legacy, complex codebases beyond a model’s context window.
  - Heavy processes (reviews, compliance, security tools) and underpowered corporate machines.
  - Poor governance and “cargo-cult” adoption of big-tech patterns and SaaS.
Evidence and the METR Study
- METR study: experienced OSS devs believed AI made them faster but were actually ~19% slower.
- Commenters read the result in two ways:
  - Developers are bad at self-assessing AI productivity.
  - Short-term experiments with unfamiliar tools understate potential long-term benefits.
- Commenters note that incentives heavily favor publishing “insane improvement” studies, so the relative scarcity of credible results of that kind is itself seen as informative.
Hype, Measurement, and Future Use
- Many criticize “evolve or die” AI marketing and inflated “70% productivity” claims, arguing that this hype fuels backlash.
- Measuring knowledge-work productivity is seen as intrinsically hard; stats are treated with skepticism.
- Some expect that real gains will require new workflows (more planning, testing, and review around AI), organizational change, and time for “AI fluency” to develop.