Getting good results from Claude Code

Role of specs and planning

  • Many commenters echo the article: clear specs dramatically improve Claude Code’s results, especially when the human already knows how they’d implement the feature.
  • Others warn against “waterfall over-spec”: you can’t anticipate everything, so aim for a “Goldilocks” spec and revise after each implementation/test cycle.
  • Some experienced engineers argue that big up‑front specs are often wasted; they prefer the smallest working version plus rapid iteration, using Claude to speed that loop.
  • Several workflows use Claude itself to help draft, critique, and refine specs or technical designs before coding.

Workflows and best practices

  • Two broad styles emerge:
    • “Plan-first”: ideation chat → written spec/CLAUDE.md → architecture/file layout → phased implementation plans → small PR-sized changes.
    • “Micro-steps”: no big spec; ask for the next tiny change, review diffs, fix or roll back immediately.
  • People use subagents and slash commands for roles such as Socratic questioning, spec drafting, and critique, plus dedicated TDD/test-generation agents (a sketch of a custom command and a minimal CLAUDE.md follows this list).
  • Asking Claude to review its own code often works well, though the review prompt can bias it toward overly negative or nitpicky feedback.
  • Some treat project-level CLAUDE.md as Claude’s “long-term memory,” updated after each phase; others find heavy rule files get ignored or suffer from context rot and prefer minimal instructions.
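
  As a concrete sketch of the slash-command and CLAUDE.md patterns above: custom commands in Claude Code are markdown prompt files under .claude/commands/, and CLAUDE.md is read automatically at session start. The file names and rule wording here are illustrative assumptions, not taken from any specific commenter, and the $ARGUMENTS placeholder is assumed to expand to the text passed to the command.

      # File: .claude/commands/spec-critique.md  (invoked as /spec-critique)
      Review the spec passed in $ARGUMENTS. List ambiguities, missing edge
      cases, and any requirement that conflicts with the existing architecture.
      Ask clarifying questions instead of guessing at intent.

      # File: CLAUDE.md  (kept deliberately short to limit context rot)
      - Run the test suite before reporting a task as done.
      - Follow existing patterns in this repo; do not add dependencies unprompted.
      - After each implementation phase, summarize key decisions in the plan doc.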

Docs, comments, and context management

  • Debate over AI-generated documentation:
    • Pro: helps humans understand intent, aids code review, and gives future AI runs cheaper, summarized context; docstrings/examples are especially valued.
    • Con: bloats context, accelerates context rot, and often documents “what” not “why”; some prefer enforcing only high-level/module-level docs.
  • Several emphasize that comments should explain the “why” and the design decisions; detailed specs or commit messages may be better than inline comment noise (a short example follows this list).
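
  A hypothetical illustration of the “why” vs “what” distinction (the scenario is invented, not from the thread):

      // Comment noise (“what”): restates the code and helps neither a reviewer
      // nor a future agent run:
      //   retries the request if it failed
      //
      // Useful comment (“why”): records the design decision behind the loop:
      //   the upstream service drops connections during its rolling deploys,
      //   so transient fetch errors are expected, and a short retry loop is
      //   cheaper than surfacing them to every caller.
      const MAX_RETRIES = 3;

      async function fetchWithRetry(url: string): Promise<Response> {
        for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
          try {
            return await fetch(url);
          } catch (err) {
            if (attempt === MAX_RETRIES) throw err; // give up after the last try
          }
        }
        throw new Error("unreachable");
      }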

Where agents shine vs struggle

  • Very effective for:
    • Greenfield features, glue code, infrastructure/tests (e.g., Playwright suites; a sample spec follows this list), and repeating existing patterns in the codebase.
    • Prototyping ideas that otherwise wouldn’t be worth the time; exploring refactors and architecture options.
  • Struggles with:
    • Complex, legacy, or domain-heavy apps; larger refactors without strong human guidance; mis-designed architectures (e.g., building a game on React hooks).
    • Overconfident “big fixes” where it misdiagnoses a small bug and starts large rewrites; users mitigate by demanding proof (logs, debug prints, tests) before big changes.
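
  On the Playwright point above, a minimal spec of the kind commenters report Claude Code generating well; the route, accessible names, and the assumption of a configured baseURL are placeholders, not details from any real project in the thread.

      import { test, expect } from '@playwright/test';

      // Assumes a baseURL is set in playwright.config.ts; the page and
      // selectors below are hypothetical.
      test('empty login submission shows a validation error', async ({ page }) => {
        await page.goto('/login');
        await page.getByRole('button', { name: 'Sign in' }).click();
        await expect(page.getByText('Email is required')).toBeVisible();
      });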

Impact on developers and learning

  • Consensus that mid/senior developers remain crucial: they write specs, choose architectures, and correct bad decisions.
  • Concern that newcomers might skip the hard-earned skill of problem decomposition if AI always writes the code.
  • Some fear management overestimating AI and demanding unrealistic productivity multipliers.

Tooling, costs, and comparisons

  • Claude Code’s CLI/agent UX is praised for forcing more deliberate use versus IDE-embedded tools that invite ad‑hoc prompting.
  • Comparisons suggest Claude Code is generally more reliable at completing tasks than Gemini CLI and some others; Gemini is cheaper but more prone to failure loops.
  • Subscription limits are a pain point: heavy Opus use can burn through cheaper plans quickly; many recommend using Sonnet for most work and reserving Opus for planning and special cases (see the model-switching sketch after this list).
  • Some question whether elaborate workflows/specs truly outperform simple, incremental “fancy autocomplete” usage; experiences are mixed.
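
  For the Sonnet/Opus split above, the usual pattern is to start sessions on Sonnet and switch up only for planning passes. Exact flags and model aliases vary by Claude Code version, so treat these invocations as an assumption and check the CLI help for your install.

      # Start a session on Sonnet for day-to-day implementation work
      claude --model sonnet

      # Inside a session, switch to Opus for a planning/architecture pass
      /model opus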