Generative AI coding tools and agents do not work for me

Perceived productivity and review costs

  • Many agree with the article: reviewing AI‑generated code thoroughly often takes as long as, or longer than, writing it yourself, especially when you feel responsible for long‑term maintenance.
  • Several see AI agents as “interns with no memory”: they never accumulate project context, so every task restarts from scratch, unlike human juniors who learn over time.
  • Some argue skeptics are effectively choosing to keep very strict review standards; AI can be faster if you relax depth of review or accept more risk.

Where AI tools shine

  • Widely cited sweet spots:
    • Boilerplate and rote code (forms, React context/providers, Terraform tags, localization strings, simple scripts); see the provider sketch after this list.
    • Debugging: explaining stack traces, finding likely causes, writing small targeted tests.
    • Navigating unfamiliar APIs or frameworks, especially when you already know how to judge the answers.
    • Reducing typing/RSI via autocomplete and tab‑completion.
  • Many use AI heavily for personal/toy projects and prototypes, but find it far less effective on large, old, or highly coupled enterprise codebases.
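
To make “boilerplate” concrete, here is a minimal sketch of the kind of React context/provider code commenters have in mind; the ThemeContext name and shape are hypothetical, not taken from the thread.

```tsx
// Hypothetical example of rote provider/hook boilerplate often delegated to AI tools.
import React, { createContext, useContext, useState, ReactNode } from "react";

type Theme = "light" | "dark";

interface ThemeContextValue {
  theme: Theme;
  toggleTheme: () => void;
}

const ThemeContext = createContext<ThemeContextValue | undefined>(undefined);

export function ThemeProvider({ children }: { children: ReactNode }) {
  const [theme, setTheme] = useState<Theme>("light");
  const toggleTheme = () => setTheme((t) => (t === "light" ? "dark" : "light"));
  return (
    <ThemeContext.Provider value={{ theme, toggleTheme }}>
      {children}
    </ThemeContext.Provider>
  );
}

export function useTheme(): ThemeContextValue {
  const ctx = useContext(ThemeContext);
  if (!ctx) throw new Error("useTheme must be used inside a ThemeProvider");
  return ctx;
}
```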

Workflow, prompting, and “AI coding” as a skill

  • Supporters stress that AI coding requires new skills: writing specs, breaking work into tasks, managing context (CLAUDE.md, AGENTS.md, rules files), and designing workflows (spec → plan → stepwise implementation); a sample context file follows this list.
  • Some run multiple agents in parallel, have AI draft specs, or use it asynchronously (let it churn on low‑priority tasks while they do other work).
  • Others find this orchestration cognitively expensive and fragile across model changes, and question how transferable these skills will be.
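
As an illustration of the “managing context” point, here is a purely hypothetical sketch of what a project context file in the CLAUDE.md / AGENTS.md style might contain; the stack, commands, and rules below are invented for illustration, not taken from the thread or from any tool’s documentation.

```markdown
# Project context for coding agents (hypothetical example)

## Stack
- TypeScript + React frontend, Node.js backend, PostgreSQL.

## Conventions
- Run `npm test` and `npm run lint` before proposing changes.
- Keep diffs small and focused; one concern per pull request.
- Never modify files under `generated/`.

## Workflow
1. Draft a short spec for the task and wait for approval.
2. Break the spec into a step-by-step plan.
3. Implement one step at a time, adding tests with each step.
```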

Quality, testing, and risk

  • Commenters cite studies reporting mixed or no productivity gains, more bugs, and potential cognitive downsides from offloading thinking.
  • Tests are seen as necessary but insufficient: they only spot‑check behavior and can’t guarantee correctness; AI can write superficial tests but struggles with deep test design (see the sketch after this list).
  • Comparisons to compilers emphasize that LLMs are non‑deterministic and much less trustworthy; you must treat their output like untrusted third‑party code.
  • Some teams report success running multiple AI code reviewers that surface real issues, even as they consider agentic code generation too risky.
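
A minimal sketch of the “tests only spot‑check behavior” point, with invented names: the implementation below is wrong for half of its inputs, yet the superficial happy‑path test passes.

```ts
// Hypothetical example: a green test does not imply a correct implementation.

/** Intended behavior: return the absolute difference between a and b. */
function absDiff(a: number, b: number): number {
  return a - b; // bug: wrong whenever b > a
}

// Superficial happy-path check of the kind AI tools readily generate: it passes.
console.assert(absDiff(5, 3) === 2, "absDiff(5, 3) should be 2");

// A check chosen with the likely failure mode in mind exposes the bug immediately.
console.assert(absDiff(3, 5) === 2, "absDiff(3, 5) should be 2"); // fails: returns -2
```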

Learning, cognition, and juniors

  • A recurring concern: heavy reliance on AI erodes problem‑solving skills and domain understanding, especially for juniors who never learn to code without it.
  • Others counter that reading/reviewing lots of code (including AI‑generated) can sharpen skills, and that AI can be a powerful teacher if used to supplement, not replace, thinking.

Economics, access, and polarization

  • There is a strong divide between people reporting 5–10x speedups (especially in CRUD/frontend work) and those who see near‑zero or negative ROI on complex, architectural, or safety‑critical systems.
  • Cost is debated: some say a few hundred dollars is enough to “get good”; others note that this is a significant expense for many and argue that employers, not individuals, should fund the tools.
  • Several liken the debate to historic editor/IDE wars: some expect AI to become as standard as IDEs; others think its unreliability and review burden will cap its role.