Getting good results from Claude Code

Role of specs and planning

  • Many commenters echo the article: clear specs dramatically improve Claude Code’s results, especially when the human already knows how they’d implement the feature.
  • Others warn against “waterfall over-spec”: you can’t anticipate everything, so aim for a “Goldilocks” spec and revise after each implementation/test cycle.
  • Some experienced engineers argue that big up‑front specs are often wasted; they prefer the smallest working version plus rapid iteration, using Claude to speed that loop.
  • Several workflows use Claude itself to help draft, critique, and refine specs or technical designs before coding.

Workflows and best practices

  • Two broad styles emerge:
    • “Plan-first”: ideation chat → written spec/CLAUDE.md → architecture/file layout → phased implementation plans → small PR-sized changes.
    • “Micro-steps”: no big spec; ask for the next tiny change, review diffs, fix or roll back immediately.
  • People use subagents and slash commands for roles such as Socratic questioning, spec drafting, and critique, plus dedicated TDD/test-generation agents (a sketch of a custom command and a minimal CLAUDE.md follows this list).
  • Asking Claude to review its own code often works well, though the review prompt can bias it toward overly negative or nitpicky feedback.
  • Some treat project-level CLAUDE.md as Claude’s “long-term memory,” updated after each phase; others find heavy rule files get ignored or suffer from context rot and prefer minimal instructions.
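
  As a concrete sketch of the slash-command and CLAUDE.md patterns above: custom commands in Claude Code are markdown prompt files under .claude/commands/, and CLAUDE.md is read automatically at session start. The file names and rule wording here are illustrative assumptions, not taken from any specific commenter, and the $ARGUMENTS placeholder is assumed to expand to the text passed to the command.

      # File: .claude/commands/spec-critique.md  (invoked as /spec-critique)
      Review the spec passed in $ARGUMENTS. List ambiguities, missing edge
      cases, and any requirement that conflicts with the existing architecture.
      Ask clarifying questions instead of guessing at intent.

      # File: CLAUDE.md  (kept deliberately short to limit context rot)
      - Run the test suite before reporting a task as done.
      - Follow existing patterns in this repo; do not add dependencies unprompted.
      - After each implementation phase, summarize key decisions in the plan doc.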

Docs, comments, and context management

  • Debate over AI-generated documentation:
    • Pro: helps humans understand intent, aids code review, and gives future AI runs cheaper, summarized context; docstrings/examples are especially valued.
    • Con: bloats context, accelerates context rot, and often documents “what” not “why”; some prefer enforcing only high-level/module-level docs.
  • Several emphasize that comments should explain the “why” and the design decisions; detailed specs or commit messages may be better than inline comment noise (a short example follows this list).
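
  A hypothetical illustration of the “why” vs “what” distinction (the scenario is invented, not from the thread):

      // Comment noise (“what”): restates the code and helps neither a reviewer
      // nor a future agent run:
      //   retries the request if it failed
      //
      // Useful comment (“why”): records the design decision behind the loop:
      //   the upstream service drops connections during its rolling deploys,
      //   so transient fetch errors are expected, and a short retry loop is
      //   cheaper than surfacing them to every caller.
      const MAX_RETRIES = 3;

      async function fetchWithRetry(url: string): Promise<Response> {
        for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
          try {
            return await fetch(url);
          } catch (err) {
            if (attempt === MAX_RETRIES) throw err; // give up after the last try
          }
        }
        throw new Error("unreachable");
      }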

Where agents shine vs struggle

  • Very effective for:
    • Greenfield features, glue code, infrastructure/tests (e.g., Playwright suites; a sample spec follows this list), and repeating existing patterns in the codebase.
    • Prototyping ideas that otherwise wouldn’t be worth the time; exploring refactors and architecture options.
  • Struggles with:
    • Complex, legacy, or domain-heavy apps; larger refactors without strong human guidance; mis-designed architectures (e.g., building a game on React hooks).
    • Overconfident “big fixes” where it misdiagnoses a small bug and starts large rewrites; users mitigate by demanding proof (logs, debug prints, tests) before big changes.
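
  On the Playwright point above, a minimal spec of the kind commenters report Claude Code generating well; the route, accessible names, and the assumption of a configured baseURL are placeholders, not details from any real project in the thread.

      import { test, expect } from '@playwright/test';

      // Assumes a baseURL is set in playwright.config.ts; the page and
      // selectors below are hypothetical.
      test('empty login submission shows a validation error', async ({ page }) => {
        await page.goto('/login');
        await page.getByRole('button', { name: 'Sign in' }).click();
        await expect(page.getByText('Email is required')).toBeVisible();
      });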

Impact on developers and learning

  • Consensus that mid/senior developers remain crucial: they write specs, choose architectures, and correct bad decisions.
  • Concern that newcomers might skip the hard-earned skill of problem decomposition if AI always writes the code.
  • Some fear management overestimating AI and demanding unrealistic productivity multipliers.

Tooling, costs, and comparisons

  • Claude Code’s CLI/agent UX is praised for forcing more deliberate use versus IDE-embedded tools that invite ad‑hoc prompting.
  • Comparisons suggest Claude Code is generally more reliable at completing tasks than Gemini CLI and some others; Gemini is cheaper but more prone to failure loops.
  • Subscription limits are a pain point: heavy Opus use can burn through cheaper plans quickly; many recommend using Sonnet for most work and reserving Opus for planning and special cases (see the model-switching sketch after this list).
  • Some question whether elaborate workflows/specs truly outperform simple, incremental “fancy autocomplete” usage; experiences are mixed.
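
  For the Sonnet/Opus split above, the usual pattern is to start sessions on Sonnet and switch up only for planning passes. Exact flags and model aliases vary by Claude Code version, so treat these invocations as an assumption and check the CLI help for your install.

      # Start a session on Sonnet for day-to-day implementation work
      claude --model sonnet

      # Inside a session, switch to Opus for a planning/architecture pass
      /model opus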