Notes on rolling out Cursor and Claude Code

Ambition, DevOps, and Tooling

  • Several commenters echoed the “ambition unlock”: agents make previously unthinkable tooling projects (e.g., custom type inference, complex static analysis) feel feasible.
  • Good DevOps (fast local tests, simple commands, CI, linting/prettifying) is repeatedly cited as a force multiplier: it both helps agents work better and is itself easier to improve, because agents can do the grunt work (fixing lint errors, adding type annotations, etc.); one such loop is sketched after this list.
  • Some note that tools like Semgrep and structured API docs (e.g., llms.txt) become much more valuable in an agent-driven workflow.
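
  A minimal sketch of that grunt-work loop in TypeScript. Everything here is an assumption for illustration: askAgentToFix is a hypothetical hook for whatever agent CLI or API you run, and eslint/tsc stand in for your project's own checks.

      import { execSync } from "node:child_process";

      function run(cmd: string): { ok: boolean; output: string } {
        try {
          return { ok: true, output: execSync(cmd, { encoding: "utf8" }) };
        } catch (err: any) {
          // execSync throws on a nonzero exit; stdout/stderr hold the errors
          return { ok: false, output: `${err.stdout ?? ""}${err.stderr ?? ""}` };
        }
      }

      async function gruntWorkLoop(
        askAgentToFix: (report: string) => Promise<void>, // hypothetical agent hook
        maxRounds = 3,
      ): Promise<void> {
        for (let round = 0; round < maxRounds; round++) {
          const lint = run("npx eslint .");
          const types = run("npx tsc --noEmit");
          if (lint.ok && types.ok) return; // clean tree: nothing left to fix
          // A narrow instruction keeps the agent on grunt work, not redesigns.
          await askAgentToFix(
            "Fix only these lint/type errors; do not restructure code:\n" +
              lint.output + types.output,
          );
        }
        throw new Error(`Checks still failing after ${maxRounds} rounds; review manually.`);
      }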

Comments, Code Quality, and Maintainability

  • There’s disagreement on “ugly” agent code laden with comments.
    • Some find the excessive “what this line does” comments annoying or low value and enforce “no comments except why” via prompts or rules.
    • Others like the extra comments or simply strip them on review, arguing this is a minor tradeoff.
  • Many report that agents happily produce sprawling, unstructured code that “works” but is hard to maintain. Some see a strong correlation between code that confuses humans and code that breaks/confuses LLMs.

When and Whether to Use Agents

  • A recurring theme is “forgetting” to use agents, even among heavy users.
    • Some interpret this as a sign the tool isn’t always a big win; when you know exactly what to write, typing it is faster than prompting.
    • Others emphasize habit change, cognitive overhead of deciding to invoke the tool, and the joy/value of doing parts of the work manually.
  • Latency, iterative failures, and context-switching cost also push people to sometimes just code directly.

Ecosystem, Interfaces, and Costs

  • Alternatives and complements to Cursor/Claude Code mentioned include Aider, Plandex, JetBrains with Claude, and various CLI + Neovim setups.
  • Claude Code is described as a CLI coding agent that auto-loads project context and applies diffs rather than requiring copy/paste.
  • Token spend varies wildly: some teams see heavy users at ~$50/month, while others report burning ~$20/day on big refactors. Techniques to control cost include smaller contexts, cheaper models, chunking tasks, and caching; two of these are sketched below.
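
  Two of those techniques in a minimal TypeScript sketch: chunk a large refactor into per-file prompts, and cache responses by prompt hash so retries don't re-spend tokens. callModel is a stand-in for whatever API or CLI you use, not any specific SDK.

      import { createHash } from "node:crypto";

      type CallModel = (prompt: string) => Promise<string>;

      const cache = new Map<string, string>();

      async function refactorFiles(
        files: { path: string; source: string }[],
        instruction: string,
        callModel: CallModel, // hypothetical model hook
      ): Promise<Map<string, string>> {
        const results = new Map<string, string>();
        for (const f of files) {
          // One small prompt per file instead of the whole repo in one context.
          const prompt = `${instruction}\n\nFile: ${f.path}\n${f.source}`;
          const key = createHash("sha256").update(prompt).digest("hex");
          let out = cache.get(key);
          if (out === undefined) {
            out = await callModel(prompt); // only uncached chunks cost tokens
            cache.set(key, out);
          }
          results.set(f.path, out);
        }
        return results;
      }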

Safety, Reliability, and Workflow Design

  • Several people distrust fully agentic editing after experiences like an AI deleting half a file and replacing it with a placeholder comment.
  • Recommended mitigations: always operate via diffs, constrain scope, and have tools propose human-readable change plans before anything is applied; an approval loop along these lines is sketched after this list.
  • Claude Code is compared to supervising a very fast but very junior dev: potentially productive with close review, disastrous if left unsupervised on larger codebases.
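
  A minimal sketch of the diff-first mitigation, assuming the agent hands you a plan and a patch as plain text: nothing touches disk until a human approves, and git validates the patch before applying, so a malformed one fails loudly instead of silently truncating files.

      import { createInterface } from "node:readline/promises";
      import { execSync } from "node:child_process";
      import { writeFileSync } from "node:fs";

      async function applyWithApproval(plan: string, patch: string): Promise<void> {
        console.log("Proposed change plan:\n" + plan);
        console.log("Proposed diff:\n" + patch);
        const rl = createInterface({ input: process.stdin, output: process.stdout });
        const answer = await rl.question("Apply this patch? [y/N] ");
        rl.close();
        if (answer.trim().toLowerCase() !== "y") {
          console.log("Rejected; no files were modified.");
          return;
        }
        writeFileSync("agent.patch", patch);
        execSync("git apply --check agent.patch"); // validate before touching files
        execSync("git apply agent.patch");
        console.log("Patch applied; review with `git diff` before committing.");
      }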

Non-Engineers Shipping Code

  • The article’s example of a head of product and a PM shipping hundreds of PRs provoked strong reactions:
    • Proponents say it increases dev capacity, tightens design–implementation loops, and is safe under code review and CI.
    • Skeptics see it as “horrifying” or a “disaster waiting to happen,” arguing non-technical roles should focus on higher-leverage work and that this can create maintenance debt and hype-driven optics.
  • There’s disagreement on whether, in an AI-coding world, “non-technical” remains a meaningful category.

Capabilities, Limits, and Language Choice

  • Agentic review works best when rules are explicit and the context is local (e.g., a GitHub Action checking Rails migrations against written guidelines; see the first sketch after this list). General PR review is seen as much harder.
  • Typed languages (TypeScript, etc.) are reported to work better with LLMs: the type system catches many AI mistakes at compile time. Dynamic languages like Ruby are described as more prone to pathological outputs and runtime surprises; the second sketch below shows the kind of mistake a type checker intercepts.
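
  A sketch of that explicit-rules review pattern as a CI script: each Rails migration is sent to a model together with the team's written guidelines, and the build fails on a reported violation. callModel, the guidelines path, and the PASS/FAIL verdict format are all assumptions for illustration.

      import { readdirSync, readFileSync } from "node:fs";
      import { join } from "node:path";

      type CallModel = (prompt: string) => Promise<string>;

      async function reviewMigrations(callModel: CallModel): Promise<boolean> {
        // Hypothetical location for the team's written migration guidelines.
        const guidelines = readFileSync("docs/migration-guidelines.md", "utf8");
        const dir = "db/migrate";
        let ok = true;
        for (const file of readdirSync(dir).filter((f) => f.endsWith(".rb"))) {
          const src = readFileSync(join(dir, file), "utf8");
          // Narrow, explicit task: one migration, one ruleset, one verdict.
          const verdict = await callModel(
            `Guidelines:\n${guidelines}\n\nMigration ${file}:\n${src}\n\n` +
              `Reply "PASS" or "FAIL: <violated guideline>".`,
          );
          if (!verdict.startsWith("PASS")) {
            console.error(`${file}: ${verdict}`);
            ok = false;
          }
        }
        return ok;
      }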
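
  And a tiny illustration of the typed-language point (names are invented): with strictNullChecks on, tsc rejects the careless call site at compile time, where the Ruby equivalent would only surface at runtime as a NoMethodError on nil.

      interface User { id: number; email: string }

      const users: User[] = [{ id: 1, email: "a@example.com" }];

      function findUser(id: number): User | undefined {
        return users.find((u) => u.id === id);
      }

      const user = findUser(42); // no user with id 42, so this is undefined

      // A common agent mistake: using the result without handling the miss.
      // tsc rejects the next line: "'user' is possibly 'undefined'".
      // console.log(user.email);

      // The narrowed version compiles; the missing-user case is explicit.
      if (user) console.log(user.email);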

Economic and Philosophical Concerns

  • One view is that if “anyone can ship code,” developer compensation will be pressured downward, even if full replacement doesn’t happen.
  • There’s a deeper dispute over what LLMs are doing:
    • Critics call them “just token predictors” and liken coding agents to snake oil.
    • Others counter that next-token prediction at current scales requires and exhibits nontrivial reasoning, planning, and domain modeling, which, while imperfect, is already practically useful for many coding tasks.