Agentic Engineering Patterns

Testing, harnesses, and validation

  • Strong consensus that agentic coding only works when there is a deterministic, executable test harness (unit tests, integration tests, browser automation, compression round-trips, etc.).
  • Red/green TDD is seen as especially effective: have the agent write failing tests first, verify they fail, then implement until they pass (a minimal gate for this loop is sketched after this list).
  • Several warn that LLMs often generate “tautological” or pointless tests that always pass; suggested mitigations include:
    • Forcing tests to fail against a deliberately broken implementation, as in the second sketch after this list.
    • Using mutation-testing–style ideas to check that tests actually detect changes.
    • Being specific about edge cases and status codes, not just “write tests for X”.
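
A minimal sketch of that red/green gate, assuming a pytest-based suite and an implement_step callback that invokes the agent; both names are placeholders, not any tool's real API:

```python
import subprocess

def tests_pass() -> bool:
    # Exit code 0 from the test runner means green.
    return subprocess.run(["pytest", "-q"]).returncode == 0

def red_green_gate(implement_step) -> None:
    # Red: the freshly written tests must fail first, proving they
    # actually exercise the missing behavior.
    if tests_pass():
        raise SystemExit("no red: tests already pass, nothing drives the change")
    implement_step()  # e.g. let the agent edit code until it believes it is done
    # Green: the same tests must now pass.
    if not tests_pass():
        raise SystemExit("no green: the implementation step left tests failing")
```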
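
And a sketch of the first two mitigations, reusing the compression round-trip from the first bullet as the harness; broken_compress is a hypothetical, deliberately faulty variant:

```python
import zlib

def roundtrip_test(compress, decompress) -> None:
    # Deterministic, executable check: decompress(compress(x)) must equal x.
    data = b"agentic engineering patterns " * 100
    assert decompress(compress(data)) == data

def broken_compress(data: bytes) -> bytes:
    # Deliberately broken: silently drops the last byte before compressing.
    return zlib.compress(data[:-1])

if __name__ == "__main__":
    roundtrip_test(zlib.compress, zlib.decompress)   # must pass on the real code
    try:
        roundtrip_test(broken_compress, zlib.decompress)
    except AssertionError:
        print("ok: the test catches the broken implementation")
    else:
        raise SystemExit("tautological test: it passed against broken code")
```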

How agents are being used in practice

  • Popular uses: boilerplate, CRUD, UI flows, landing pages, documentation, and exploring unfamiliar codebases.
  • Some report they “barely write code” for certain domains (typed, well-tested web backends, React-style apps) and rely heavily on plan modes and agent loops.
  • Others find agents still slower or too brittle, especially for math-heavy logic, ML pipelines, or unfamiliar APIs, and prefer manual coding with AI as an assistant.

Planning, specs, and state management

  • Many advocate a structured workflow: write or refine a spec, have the agent produce a plan, review it, then implement with checkpoints and tests.
  • Scratch files (markdown logs, decisions/rejections lists, AGENTS.md rules) help agents avoid re-trying failed approaches and encode constraints.
  • Some move these logs into structured, queryable stores to avoid context bloat and to let multiple agents share state; a minimal sketch of such a store follows.
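
What such a store might look like, sketched with Python's standard sqlite3 module; the schema and helper names are assumptions, not any specific tool:

```python
import sqlite3

# A shared, queryable decision log in place of an ever-growing markdown scratch file.
conn = sqlite3.connect("agent_state.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS decisions (
        id       INTEGER PRIMARY KEY,
        task     TEXT NOT NULL,   -- which spec or task the entry belongs to
        approach TEXT NOT NULL,   -- what was tried
        outcome  TEXT NOT NULL CHECK (outcome IN ('accepted', 'rejected')),
        reason   TEXT             -- why it was accepted or rejected
    )
""")

def record(task: str, approach: str, outcome: str, reason: str = "") -> None:
    conn.execute(
        "INSERT INTO decisions (task, approach, outcome, reason) VALUES (?, ?, ?, ?)",
        (task, approach, outcome, reason),
    )
    conn.commit()

def rejected_approaches(task: str) -> list[str]:
    # A fresh agent can ask what has already failed before retrying it,
    # instead of replaying the whole log into its context window.
    rows = conn.execute(
        "SELECT approach FROM decisions WHERE task = ? AND outcome = 'rejected'",
        (task,),
    )
    return [row[0] for row in rows]
```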

Code review, quality, and cognitive debt

  • Code review is emerging as the main bottleneck when code becomes “cheap.”
  • Concerns about huge AI-generated PRs dumped on teammates; proposed countermeasures:
    • Smaller, bisect-safe patches (each commit keeps the suite green, so git bisect stays usable).
    • Treating agent output with the scrutiny given to a junior developer's work.
    • Shifting some review effort upstream to designs, specs, and architectural rules (an executable-rule sketch follows this list).
  • Worries about “cognitive debt” from vast amounts of AI-written code that no one truly understands; interactive explanations and better documentation are proposed as partial remedies.
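
One way to make architectural rules executable rather than review-time folklore is to express them as tests; the myapp.domain/myapp.web layering below is a hypothetical example:

```python
import ast
import pathlib

def imports_of(path: pathlib.Path) -> set[str]:
    # Collect every module name imported by a Python source file.
    tree = ast.parse(path.read_text())
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found

def test_domain_layer_does_not_import_web():
    # Layering rule: domain code must not depend on the web layer, so this
    # check never has to be re-litigated while reviewing large generated diffs.
    for source in pathlib.Path("myapp/domain").rglob("*.py"):
        assert not any(
            name.startswith("myapp.web") for name in imports_of(source)
        ), source
```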

Skepticism, limitations, and anti-patterns

  • Strong pushback against hype and a “pattern-industrial complex”: fear of reinventing simple practices (tests, planning, small commits) under grand “agentic” branding.
  • Some argue many workflows overcomplicate things; a single well-instrumented agent plus good observability can beat elaborate multi-agent setups.
  • Anti-patterns called out: unreviewed mass PRs, relying blindly on AI-written tests, and assuming agents can replace deep domain understanding.

Organizational and ethical questions

  • Mixed feelings about productivity gains that let one person do the work of several; seen as a people/management problem more than a technical one.
  • Concerns about tools that mimic human browser behavior being used for spam, though defenders cite legitimate automation use cases.