Agentic Engineering Patterns

Testing, harnesses, and validation

  • Strong consensus that agentic coding only works when there is a deterministic, executable test harness (unit tests, integration tests, browser automation, compression round-trips, etc.).
  • Red/green TDD is seen as especially effective: have the agent write failing tests first, verify they fail, then implement until they pass (a minimal gate for this loop is sketched after this list).
  • Several warn that LLMs often generate “tautological” or pointless tests that always pass; suggested mitigations include:
    • Forcing tests to fail against a deliberately broken implementation, as in the second sketch after this list.
    • Using mutation-testing–style ideas to check that tests actually detect changes.
    • Being specific about edge cases and status codes, not just “write tests for X”.
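
A minimal sketch of that red/green gate, assuming a pytest-based suite and an implement_step callback that invokes the agent; both names are placeholders, not any tool's real API:

```python
import subprocess

def tests_pass() -> bool:
    # Exit code 0 from the test runner means green.
    return subprocess.run(["pytest", "-q"]).returncode == 0

def red_green_gate(implement_step) -> None:
    # Red: the freshly written tests must fail first, proving they
    # actually exercise the missing behavior.
    if tests_pass():
        raise SystemExit("no red: tests already pass, nothing drives the change")
    implement_step()  # e.g. let the agent edit code until it believes it is done
    # Green: the same tests must now pass.
    if not tests_pass():
        raise SystemExit("no green: the implementation step left tests failing")
```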
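
And a sketch of the first two mitigations, reusing the compression round-trip from the first bullet as the harness; broken_compress is a hypothetical, deliberately faulty variant:

```python
import zlib

def roundtrip_test(compress, decompress) -> None:
    # Deterministic, executable check: decompress(compress(x)) must equal x.
    data = b"agentic engineering patterns " * 100
    assert decompress(compress(data)) == data

def broken_compress(data: bytes) -> bytes:
    # Deliberately broken: silently drops the last byte before compressing.
    return zlib.compress(data[:-1])

if __name__ == "__main__":
    roundtrip_test(zlib.compress, zlib.decompress)   # must pass on the real code
    try:
        roundtrip_test(broken_compress, zlib.decompress)
    except AssertionError:
        print("ok: the test catches the broken implementation")
    else:
        raise SystemExit("tautological test: it passed against broken code")
```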

How agents are being used in practice

  • Popular uses: boilerplate, CRUD, UI flows, landing pages, documentation, and exploring unfamiliar codebases.
  • Some report they “barely write code” for certain domains (typed, well-tested web backends, React-style apps) and rely heavily on plan modes and agent loops.
  • Others find agents still slower or too brittle, especially for math-heavy logic, ML pipelines, or unfamiliar APIs, and prefer manual coding with AI as an assistant.

Planning, specs, and state management

  • Many advocate a structured workflow: write or refine a spec, have the agent produce a plan, review it, then implement with checkpoints and tests.
  • Scratch files (markdown logs, decisions/rejections lists, AGENTS.md rules) help agents avoid re-trying failed approaches and encode constraints.
  • Some move these logs into structured, queryable stores to avoid context bloat and to let multiple agents share state; a minimal sketch of such a store follows.
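
What such a store might look like, sketched with Python's standard sqlite3 module; the schema and helper names are assumptions, not any specific tool:

```python
import sqlite3

# A shared, queryable decision log in place of an ever-growing markdown scratch file.
conn = sqlite3.connect("agent_state.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS decisions (
        id       INTEGER PRIMARY KEY,
        task     TEXT NOT NULL,   -- which spec or task the entry belongs to
        approach TEXT NOT NULL,   -- what was tried
        outcome  TEXT NOT NULL CHECK (outcome IN ('accepted', 'rejected')),
        reason   TEXT             -- why it was accepted or rejected
    )
""")

def record(task: str, approach: str, outcome: str, reason: str = "") -> None:
    conn.execute(
        "INSERT INTO decisions (task, approach, outcome, reason) VALUES (?, ?, ?, ?)",
        (task, approach, outcome, reason),
    )
    conn.commit()

def rejected_approaches(task: str) -> list[str]:
    # A fresh agent can ask what has already failed before retrying it,
    # instead of replaying the whole log into its context window.
    rows = conn.execute(
        "SELECT approach FROM decisions WHERE task = ? AND outcome = 'rejected'",
        (task,),
    )
    return [row[0] for row in rows]
```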

Code review, quality, and cognitive debt

  • Code review is emerging as the main bottleneck when code becomes “cheap.”
  • Concerns about huge AI-generated PRs dumped on teammates; proposed countermeasures:
    • Smaller, bisect-safe patches (each commit keeps the suite green, so git bisect stays usable).
    • Treating agent output with the scrutiny given to a junior developer's work.
    • Shifting some review effort upstream to designs, specs, and architectural rules (an executable-rule sketch follows this list).
  • Worries about “cognitive debt” from vast amounts of AI-written code that no one truly understands; interactive explanations and better documentation are proposed as partial remedies.
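
One way to make architectural rules executable rather than review-time folklore is to express them as tests; the myapp.domain/myapp.web layering below is a hypothetical example:

```python
import ast
import pathlib

def imports_of(path: pathlib.Path) -> set[str]:
    # Collect every module name imported by a Python source file.
    tree = ast.parse(path.read_text())
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found

def test_domain_layer_does_not_import_web():
    # Layering rule: domain code must not depend on the web layer, so this
    # check never has to be re-litigated while reviewing large generated diffs.
    for source in pathlib.Path("myapp/domain").rglob("*.py"):
        assert not any(
            name.startswith("myapp.web") for name in imports_of(source)
        ), source
```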

Skepticism, limitations, and anti-patterns

  • Strong pushback against hype and a “pattern-industrial complex”: fear of reinventing simple practices (tests, planning, small commits) under grand “agentic” branding.
  • Some argue many workflows overcomplicate things; a single well-instrumented agent plus good observability can beat elaborate multi-agent setups.
  • Anti-patterns called out: unreviewed mass PRs, relying blindly on AI-written tests, and assuming agents can replace deep domain understanding.

Organizational and ethical questions

  • Mixed feelings about productivity gains that let one person do the work of several; seen as a people/management problem more than a technical one.
  • Concerns about tools that mimic human browser behavior being used for spam, though defenders cite legitimate automation use cases.