Ask HN: Do you have any evidence that agentic coding works?
What “agentic coding” means in this thread
- Most distinguish between:
- AI as an assistant (pair programmer / fast typist / junior dev).
- AI as an agent (planning, editing, testing, committing with some autonomy).
- Almost everyone agrees: fully autonomous, unreviewed agentic coding is unsafe; human review is mandatory.
Where it works well (reported evidence)
- Boilerplate, glue code, CRUD apps, small CLIs, internal tools, “monkey work” refactors.
- Porting code between languages/platforms (e.g. C→Go, Java→Laravel, backends, extensions).
- Performance experiments and prototypes where correctness is easy to check and bad code can be rewritten.
- Sysadmin/devops tasks via CLI tools and MCPs (querying services, investigating spikes, debugging).
- Greenfield apps in domains heavily represented in open source, especially with clear specs.
- Several users report 2–10x speedups on personal or medium-scale projects when tightly supervising agents.
Where it fails or becomes net‑negative
- Large or complex systems requiring high‑level design, long‑term planning, or deep domain modeling.
- Existing big codebases (especially huge monorepos) where keeping the agent “on the rails” is hard.
- Situations where architecture, maintainability, and long‑term reasoning matter more than raw throughput.
- Frontends and iOS are called out as particularly bumpy targets.
Testing, review, and quality concerns
- Major failure mode: letting the AI write both implementation and tests (tests that always pass, “cheating” behaviors).
- Code often works but is over‑engineered, messy, duplicated, or uses deprecated APIs.
- Some argue speed outweighs technical debt (especially for throwaway/validation work); others stress that “debt must be paid,” if not by you then by someone else.
Workflows and practices that help
- Use agents for small, well‑scoped changes; avoid broad, open‑ended tasks.
- Always start with a plan; iterate on the plan before allowing code changes.
- TDD-like loops: write/review tests as the spec, then let agents implement.
- Maintain project docs for the agent (AGENTS.md/CLAUDE.md), logs, and explicit coding guidelines.
- Use multiple passes/agents for review (e.g., “rule of 5” diverse reviews).
- Abort and revert when the agent gets stuck or starts making subtle mistakes.
Scale, hype, and open questions
- Big‑company anecdotes: heavy assisted use, but little evidence of fully agentic coding at scale.
- Many see agentic coding as a powerful force multiplier if you already know what “good” looks like and treat it like managing a team of permanent juniors.
- Others remain skeptical, calling much of the discourse FOMO‑driven marketing, with few reproducible, detailed success reports.