2026-01-20

Ask HN: Do you have any evidence that agentic coding works?

What “agentic coding” means in this thread

Most distinguish between:
- AI as an assistant (pair programmer / fast typist / junior dev).
- AI as an agent (planning, editing, testing, committing with some autonomy).
Almost everyone agrees: fully autonomous, unreviewed agentic coding is unsafe; human review is mandatory.

Where it works well (reported evidence)

Boilerplate, glue code, CRUD apps, small CLIs, internal tools, “monkey work” refactors.
Porting code between languages/platforms (e.g. C→Go, Java→Laravel, backends, extensions).
Performance experiments and prototypes where correctness is easy to check and bad code can be rewritten.
Sysadmin/devops tasks via CLI tools and MCPs (querying services, investigating spikes, debugging).
Greenfield apps in domains heavily represented in open source, especially with clear specs.
Several users report 2–10x speedups on personal or medium-scale projects when tightly supervising agents.

Where it fails or becomes net‑negative

Large or complex systems requiring high‑level design, long‑term planning, or deep domain modeling.
Existing big codebases (especially huge monorepos) where keeping the agent “on the rails” is hard.
Situations where architecture, maintainability, and long‑term reasoning matter more than raw throughput.
Frontends and iOS are called out as particularly bumpy targets.

Testing, review, and quality concerns

Major failure mode: letting the AI write both implementation and tests (tests that always pass, “cheating” behaviors).
Code often works but is over‑engineered, messy, duplicated, or uses deprecated APIs.
Some argue speed outweighs technical debt (especially for throwaway/validation work); others stress that “debt must be paid,” if not by you then by someone else.

Workflows and practices that help

Use agents for small, well‑scoped changes; avoid broad, open‑ended tasks.
Always start with a plan; iterate on the plan before allowing code changes.
TDD-like loops: write/review tests as the spec, then let agents implement.
Maintain project docs for the agent (AGENTS.md/CLAUDE.md), logs, and explicit coding guidelines.
Use multiple passes/agents for review (e.g., “rule of 5” diverse reviews).
Abort and revert when the agent gets stuck or starts making subtle mistakes.

Scale, hype, and open questions

Big‑company anecdotes: heavy assisted use, but little evidence of fully agentic coding at scale.
Many see agentic coding as a powerful force multiplier if you already know what “good” looks like and treat it like managing a team of permanent juniors.
Others remain skeptical, calling much of the discourse FOMO‑driven marketing, with few reproducible, detailed success reports.

Related topics