Notes on rolling out Cursor and Claude Code
Ambition, DevOps, and Tooling
- Several commenters echoed the “ambition unlock”: agents make previously unthinkable tooling projects (e.g., custom type inference, complex static analysis) feel feasible.
- Good DevOps (fast local tests, simple commands, CI, linting/prettifying) is repeatedly cited as a force multiplier: it helps agents work better, and it is itself easier to improve because agents can do the grunt work (fixing lint, tightening types, etc.); see the sketch after this list.
- Some note that tools like Semgrep and structured API docs (e.g., llm.txt) become much more valuable in an agent-driven workflow.
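One way to make the DevOps point concrete is a single cheap command that gives an agent fast pass/fail feedback. A minimal sketch in TypeScript, assuming a hypothetical Node project; the eslint/tsc/vitest commands are placeholders for whatever a repo actually uses:

```typescript
// check.ts - one fast feedback command an agent (or human) can run after edits.
// The specific tools invoked here are assumptions, not from the discussion.
import { execSync } from "node:child_process";

const steps: Array<[string, string]> = [
  ["lint", "npx eslint ."],
  ["types", "npx tsc --noEmit"],
  ["tests", "npx vitest run"],
];

for (const [name, cmd] of steps) {
  try {
    execSync(cmd, { stdio: "inherit" });
    console.log(`ok: ${name}`);
  } catch {
    // Fail fast with a nonzero exit code so the agent can react to the failure.
    console.error(`failed: ${name} (${cmd})`);
    process.exit(1);
  }
}
console.log("all checks passed");
```

The point is less the specific tools than the shape: one command, quick to run, unambiguous exit status.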
Comments, Code Quality, and Maintainability
- There’s disagreement on “ugly” agent code laden with comments.
  - Some find the excessive “what this line does” comments annoying or low value and enforce “no comments except why” via prompts or rules (see the sketch after this list).
  - Others like the extra comments or simply strip them during review, arguing this is a minor tradeoff.
- Many report that agents happily produce sprawling, unstructured code that “works” but is hard to maintain. Some see a strong correlation between code that confuses humans and code that breaks/confuses LLMs.
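For the “no comments except why” camp, the enforcement is typically a standing instruction in the tool’s rules file (e.g., .cursorrules for Cursor or CLAUDE.md for Claude Code). A hedged sketch of what such a rule might say; the exact wording and file conventions vary by tool and version:

```
# Commenting policy
- Do not add comments that restate what a line of code does.
- Comment only to capture "why": constraints, workarounds, invariants,
  and links to issues or specs.
- Never delete code and leave a placeholder comment in its place.
```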
When and Whether to Use Agents
- A recurring theme is “forgetting” to use agents, even among heavy users.
  - Some interpret this as a sign the tool isn’t always a big win; when you know exactly what to write, typing it is faster than prompting.
  - Others emphasize habit change, the cognitive overhead of deciding to invoke the tool, and the joy/value of doing parts of the work manually.
- Latency, iterative failures, and context-switching costs also push people to sometimes just code directly.
Ecosystem, Interfaces, and Costs
- Alternatives and complements to Cursor/Claude Code mentioned include Aider, Plandex, JetBrains with Claude, and various CLI + Neovim setups.
- Claude Code is described as a CLI coding agent that auto-loads project context and applies diffs rather than requiring copy/paste.
- Token spend varies wildly: some teams see ~$50/month for heavy users; others report burning ~$20/day on big refactors. Techniques to control cost include smaller contexts, cheaper models, chunking tasks, and caching (see the sketch below).
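The gap between ~$50/month and ~$20/day is mostly tokens per task times model rates, which is why trimming context pays off. A back-of-the-envelope sketch; the per-token rates are illustrative placeholders, not current pricing:

```typescript
// Rough daily spend estimate. All rates are assumed for illustration;
// check your provider's actual pricing.
const INPUT_RATE = 3 / 1_000_000;   // $ per input token (assumed)
const OUTPUT_RATE = 15 / 1_000_000; // $ per output token (assumed)

function dailyCost(turns: number, inputPerTurn: number, outputPerTurn: number): number {
  return turns * (inputPerTurn * INPUT_RATE + outputPerTurn * OUTPUT_RATE);
}

// A big refactor: 40 agent turns, each re-reading ~100k tokens of context.
console.log(dailyCost(40, 100_000, 5_000).toFixed(2)); // "15.00"
// Same turns with trimmed context and chunked tasks:
console.log(dailyCost(40, 20_000, 5_000).toFixed(2));  // "5.40"
```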
Safety, Reliability, and Workflow Design
- Several people distrust fully agentic editing after experiences like an AI deleting half a file and replacing it with a placeholder comment.
- Recommended mitigations: always operate via diffs, constrain scope, and have tools propose human-readable change plans (a minimal sketch of a diff gate follows this list).
- Claude Code is compared to supervising a very fast but very junior dev: potentially productive with close review, disastrous if left unsupervised on larger codebases.
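“Always operate via diffs” can be made concrete with a small gate between proposal and application: nothing lands without a human seeing the exact hunks. A minimal sketch, assuming the agent emits a unified diff and git is on PATH; applyWithReview is a hypothetical name:

```typescript
// apply-with-review.ts - show a proposed unified diff and require explicit
// approval before applying it to the working tree.
import { execSync } from "node:child_process";
import { writeFileSync } from "node:fs";
import { createInterface } from "node:readline/promises";

async function applyWithReview(proposedDiff: string): Promise<boolean> {
  console.log("--- proposed change ---\n" + proposedDiff);
  const rl = createInterface({ input: process.stdin, output: process.stdout });
  const answer = (await rl.question("apply this diff? [y/N] ")).trim().toLowerCase();
  rl.close();
  if (answer !== "y") return false; // reject by default
  writeFileSync("/tmp/agent.patch", proposedDiff);
  // git refuses patches that don't apply cleanly, which is the point:
  // only the reviewed hunks can land, and only as written.
  execSync("git apply --verbose /tmp/agent.patch", { stdio: "inherit" });
  return true;
}
```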
Non-Engineers Shipping Code
- The article’s example of a head of product and PM shipping hundreds of PRs provoked strong reactions:
  - Proponents say it increases dev capacity, tightens design–implementation loops, and is safe under code review and CI.
  - Skeptics see it as “horrifying” or a “disaster waiting to happen,” arguing non-technical roles should focus on higher-leverage work and that this can create maintenance debt and hype-driven optics.
- There’s disagreement on whether, in an AI-coding world, “non-technical” remains a meaningful category.
Capabilities, Limits, and Language Choice
- Agentic review works best when rules are explicit and context is local (e.g., a GitHub Action checking Rails migrations against written guidelines; a hedged sketch follows this list). General PR review is seen as much harder.
- Typed languages (TypeScript, etc.) are reported to work better with LLMs; type systems catch many AI mistakes (illustrated below). Dynamic languages like Ruby are described as producing more pathological outputs and runtime surprises.
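The migration-review example works because the rule is mechanical and the context is one file. A hedged sketch of that shape of check, runnable from CI such as a GitHub Action; the directory layout and the specific guideline are assumptions:

```typescript
// check-migrations.ts - flag Rails migrations that add an index without
// `algorithm: :concurrently` (an assumed written guideline for large tables).
import { readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

const dir = "db/migrate";
let violations = 0;
for (const file of readdirSync(dir)) {
  const src = readFileSync(join(dir, file), "utf8");
  if (/add_index/.test(src) && !/algorithm:\s*:concurrently/.test(src)) {
    console.error(`${file}: add_index without algorithm: :concurrently`);
    violations++;
  }
}
process.exit(violations > 0 ? 1 : 0);
```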
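As for typed languages catching AI mistakes, many characteristic LLM slips are exactly what tsc rejects before anything runs. An illustrative example; the User shape is made up:

```typescript
interface User {
  id: string;
  email: string | null; // nullable on purpose: not every user has an email
}

function emailDomain(u: User): string | undefined {
  // A characteristic LLM slip is `return u.email.split("@")[1];`, assuming
  // email is always present. With strictNullChecks, tsc rejects that line:
  //   error TS18047: 'u.email' is possibly 'null'.
  // In Ruby, the equivalent call would only surface as a NoMethodError at runtime.
  return u.email?.split("@")[1]; // optional chaining satisfies the checker
}
```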
Economic and Philosophical Concerns
- One view is that if “anyone can ship code,” developer compensation will be pressured downward, even if full replacement doesn’t happen.
- There’s a deeper dispute over what LLMs are doing:
  - Critics call them “just token predictors” and liken coding agents to snake oil.
  - Others counter that next-token prediction at current scales requires and exhibits nontrivial reasoning, planning, and domain modeling, which, while imperfect, is already practically useful for many coding tasks.