Levels of Agentic Engineering
Framing the “levels” model
- Several commenters dislike the ladder framing because, in their view, it implies “higher = better” and encourages gatekeeping and toxicity.
- Some see the “levels” more as historical stages in the AI tooling ecosystem than as a personal skill ladder.
- Alternative taxonomies (e.g., car-autonomy-inspired, simpler 2–5 level schemes) are mentioned as cleaner for communication.
- A minimalist view: only two real modes – human-with-AI-assist vs AI-with-human-assist – with jokes about “AI with AI assist.”
Autonomous agents and “dark factories”
- Curiosity and skepticism around fully autonomous “software factories” that generate large codebases with minimal human input.
- Key challenge raised: if building software can be fully delegated, why not sell the factory itself rather than its output? Others reply that we’re not there yet, and that sales, marketing, and market fit remain unsolved by LLMs.
- Some expect such factories to disrupt or “kill” much of traditional enterprise software; others argue internal enterprise software and regulatory checks will still demand human oversight.
Validation, quality, and context limits
- Multiple comments argue that the real bottleneck is validation, not orchestration: producing 100× more code without 100× more validation harms quality.
- Flaky tests, regulatory constraints, and subtle bugs (e.g., data persistence, crypto correctness) are cited as current blockers to full autonomy.
- Long-running agents hit “context rot” and re-discover work; file-based persistent state and specs are proposed as pragmatic mitigations.
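The file-based persistent state mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a setup any commenter described: the file name `agent_state.json`, the helper names (`load_state`, `save_state`, `mark_done`, `pending`), and the task strings are all hypothetical. The idea is that a restarted agent reads the file first and skips work it has already finished instead of re-discovering it.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved state, or an empty state if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, path)


def mark_done(path: Path, task: str) -> None:
    """Record a finished task; recording the same task twice has no effect."""
    state = load_state(path)
    if task not in state["done"]:
        state["done"].append(task)
    save_state(path, state)


def pending(path: Path, plan: list[str]) -> list[str]:
    """Return the tasks from the plan that are not yet recorded as done, in plan order."""
    done = set(load_state(path)["done"])
    return [task for task in plan if task not in done]
```

A harness might call `pending(Path("agent_state.json"), plan)` at the start of every session and `mark_done(...)` after each verified step, so the state survives a context reset.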
Capturing project knowledge and context
- Strong focus on “context engineering”: CLAUDE.md-style rules, skills, ADRs, design docs, and structured commit messages.
- A big gap is identified between recording what was done and recording why it was done; several patterns are suggested to close it (ADRs, contextual commits, typed prompt blocks).
- Consensus that structured constraints and schemas significantly improve reliability over free-form instructions.
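One way to picture the “what vs why” gap and the preference for structured constraints is a small typed record that refuses to exist without a rationale. The sketch below is hypothetical: `DecisionRecord`, its fields, and the `[decision]` block layout are invented for illustration and do not correspond to any specific ADR tool or prompt format named in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical ADR-like record that keeps the 'why' next to the 'what'."""

    title: str
    what: str
    why: str
    constraints: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Reject records with a blank title, description, or rationale.
        for name in ("title", "what", "why"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")

    def to_prompt_block(self) -> str:
        """Render the record as a fixed-layout text block for an agent's context."""
        lines = [
            f"[decision] {self.title}",
            f"what: {self.what}",
            f"why: {self.why}",
        ]
        lines += [f"constraint: {c}" for c in self.constraints]
        return "\n".join(lines)
```

The design choice here is simply that a missing rationale fails loudly at creation time, rather than being discovered later when an agent or a human needs it. Whether this improves agent reliability in practice is the commenters' claim, not something the sketch demonstrates.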
Real-world usage patterns and ergonomics
- Reported successful setups: CI-based code review agents, microbenchmarking/performance agents, background harnesses, and manual triggering of “factories” for specific processes.
- Multi-agent teams are powerful for some, but criticized for poor dev experience, high token burn, and fragile permission management.
- Many developers still operate at “copy-paste into chat” or simple Chat IDE/CLI levels and find that both effective and safer.
Human bottlenecks, communication, and hype
- As agents get stronger, the bottleneck shifts from “how to build” to “what to build,” sequencing, and articulating requirements.
- Some see voice as a useful way to dump rich context; others strongly prefer deliberate writing.
- There is substantial skepticism about hype, money-making claims, and very high “levels”; several commenters report that LLMs are often “just” a much better search/autocomplete rather than a true dark factory today.