Generative AI coding tools and agents do not work for me

Perceived productivity and review costs

  • Many agree with the article: reviewing AI‑generated code thoroughly often takes as long as, or longer than, writing it yourself, especially when you feel responsible for long‑term maintenance.
  • Several see AI agents as “interns with no memory”: they never accumulate project context, so every task restarts from scratch, unlike human juniors who learn over time.
  • Some argue skeptics are effectively choosing to keep very strict review standards; AI can be faster if you relax depth of review or accept more risk.

Where AI tools shine

  • Widely cited sweet spots:
    • Boilerplate and rote code (forms, React context/providers, Terraform tags, localization strings, simple scripts); see the provider sketch after this list.
    • Debugging: explaining stack traces, finding likely causes, writing small targeted tests.
    • Navigating unfamiliar APIs or frameworks, especially when you already know how to judge the answers.
    • Reducing typing/RSI via autocomplete and tab‑completion.
  • Many use AI heavily for personal/toy projects and prototypes, but find it far less effective on large, old, or highly coupled enterprise codebases.
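
To make “boilerplate” concrete, here is a minimal sketch of the kind of React context/provider code commenters have in mind; the ThemeContext name and shape are hypothetical, not taken from the thread.

```tsx
// Hypothetical example of rote provider/hook boilerplate often delegated to AI tools.
import React, { createContext, useContext, useState, ReactNode } from "react";

type Theme = "light" | "dark";

interface ThemeContextValue {
  theme: Theme;
  toggleTheme: () => void;
}

const ThemeContext = createContext<ThemeContextValue | undefined>(undefined);

export function ThemeProvider({ children }: { children: ReactNode }) {
  const [theme, setTheme] = useState<Theme>("light");
  const toggleTheme = () => setTheme((t) => (t === "light" ? "dark" : "light"));
  return (
    <ThemeContext.Provider value={{ theme, toggleTheme }}>
      {children}
    </ThemeContext.Provider>
  );
}

export function useTheme(): ThemeContextValue {
  const ctx = useContext(ThemeContext);
  if (!ctx) throw new Error("useTheme must be used inside a ThemeProvider");
  return ctx;
}
```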

Workflow, prompting, and “AI coding” as a skill

  • Supporters stress that AI coding requires new skills: writing specs, breaking work into tasks, managing context (CLAUDE.md, AGENTS.md, rules files), and designing workflows (spec → plan → stepwise implementation); a sample context file follows this list.
  • Some run multiple agents in parallel, have AI draft specs, or use it asynchronously (let it churn on low‑priority tasks while they do other work).
  • Others find this orchestration cognitively expensive and fragile across model changes, and question how transferable these skills will be.
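
As an illustration of the “managing context” point, here is a purely hypothetical sketch of what a project context file in the CLAUDE.md / AGENTS.md style might contain; the stack, commands, and rules below are invented for illustration, not taken from the thread or from any tool’s documentation.

```markdown
# Project context for coding agents (hypothetical example)

## Stack
- TypeScript + React frontend, Node.js backend, PostgreSQL.

## Conventions
- Run `npm test` and `npm run lint` before proposing changes.
- Keep diffs small and focused; one concern per pull request.
- Never modify files under `generated/`.

## Workflow
1. Draft a short spec for the task and wait for approval.
2. Break the spec into a step-by-step plan.
3. Implement one step at a time, adding tests with each step.
```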

Quality, testing, and risk

  • Commenters cite studies reporting mixed or no productivity gains, more bugs, and potential cognitive downsides from offloading thinking.
  • Tests are seen as necessary but insufficient: they only spot‑check behavior and can’t guarantee correctness; AI can write superficial tests but struggles with deep test design (see the sketch after this list).
  • Comparisons to compilers emphasize that LLMs are non‑deterministic and much less trustworthy; you must treat their output like untrusted third‑party code.
  • Some teams report success running multiple AI code reviewers that surface real issues, even as they consider agentic code generation too risky.
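
A minimal sketch of the “tests only spot‑check behavior” point, with invented names: the implementation below is wrong for half of its inputs, yet the superficial happy‑path test passes.

```ts
// Hypothetical example: a green test does not imply a correct implementation.

/** Intended behavior: return the absolute difference between a and b. */
function absDiff(a: number, b: number): number {
  return a - b; // bug: wrong whenever b > a
}

// Superficial happy-path check of the kind AI tools readily generate: it passes.
console.assert(absDiff(5, 3) === 2, "absDiff(5, 3) should be 2");

// A check chosen with the likely failure mode in mind exposes the bug immediately.
console.assert(absDiff(3, 5) === 2, "absDiff(3, 5) should be 2"); // fails: returns -2
```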

Learning, cognition, and juniors

  • A recurring concern: heavy reliance on AI erodes problem‑solving skills and domain understanding, especially for juniors who never learn to code without it.
  • Others counter that reading/reviewing lots of code (including AI‑generated) can sharpen skills, and that AI can be a powerful teacher if used to supplement, not replace, thinking.

Economics, access, and polarization

  • There is a strong divide between people reporting 5–10x speedups (especially in CRUD/frontend work) and those who see near‑zero or negative ROI on complex, architectural, or safety‑critical systems.
  • Cost is debated: some say a few hundred dollars is enough to “get good”; others note that this is a significant expense for many and argue that employers, not individuals, should fund the tools.
  • Several liken the debate to historic editor/IDE wars: some expect AI to become as standard as IDEs; others think its unreliability and review burden will cap its role.