Claude is good at assembling blocks, but still falls apart at creating them

Perceived Capability: “Good Junior Dev,” Not Senior

  • Many compare Claude and similar tools to a competent junior developer: fast at localized tasks, but needing close review and architectural guidance.
  • Some report shipping “most of their code” with Claude (Opus 4.5), including production systems, with clear gains in velocity and bug-fixing (e.g., generating PRs from Datadog errors).
  • Others argue even a good human junior is still more capable, especially at handling ambiguity and understanding systems.

Abstraction, Architecture, and API Design

  • Strong consensus that LLMs excel at filling in details (implementing features, wiring code, refactoring with specific patterns) but struggle to invent good abstractions, APIs, or module boundaries without human direction.
  • Examples cited: copying data inefficiently instead of rearchitecting to share it by reference; poor React component design; Python code with deeply nested ifs, mis-scoped imports, and swallowed exceptions.
  • Several note this mirrors the median human developer: most people are bad at API design and high-level abstraction anyway.
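
To make the Python complaints above concrete, here is a minimal sketch contrasting the style commenters describe with a flatter alternative. Both functions and the config-parsing task are hypothetical, invented for illustration; they are not taken from the thread or from any actual Claude output.

```python
import json
from typing import Any


def load_port_generated_style(raw: str) -> int:
    """Hypothetical example of the criticized style:
    nested ifs, an import buried in the function, and a swallowed exception."""
    try:
        import json as _json  # mis-scoped import: belongs at module level
        data = _json.loads(raw)
        if data is not None:
            if isinstance(data, dict):
                if "port" in data:
                    if isinstance(data["port"], int):
                        return data["port"]
    except Exception:
        pass  # error silently swallowed; the caller cannot tell what failed
    return 0  # bad input and "port is 0" are now indistinguishable


def load_port(raw: str) -> int:
    """Same task with guard clauses, a module-level import, and explicit errors."""
    try:
        data: Any = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"config is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    port = data.get("port")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(port, int) or isinstance(port, bool):
        raise ValueError("config must contain an integer 'port'")
    return port
```

The first version returns 0 for malformed input, which hides the failure; the second raises a ValueError that says which check failed. Neither is a claim about what any particular model produces, only an illustration of the pattern named in the bullet.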

“Just Search” vs Compression and Novelty

  • One camp frames LLMs as “really good search”: semantic retrieval over training data and user code, recombining known patterns. In their view this mental model sets realistic expectations: great at mapping, translating, and modifying; weak at truly “from scratch” creation.
  • Others call “just search” reductive, likening it to calling CPUs “just transistor states.” They emphasize:
    • LLMs act as lossy probabilistic compressors of human knowledge, synthesizing and recombining concepts.
    • Internal “circuits” and conceptual relationships can enable interpolation, limited extrapolation, and emergent reasoning-like behavior.
  • The debate over whether outputs can ever be genuinely novel, rather than only “novel to the user,” continues without consensus.

Reliability, Hallucinations, and Verification

  • Experiences are highly mixed: some see large quality improvements and fewer hallucinations over time; others still hit made-up APIs, types, or misleading solutions that waste time.
  • Verification harnesses (e.g., static typing, linting, tests, formal methods) can catch many hallucinations in code, but most domains outside code lack such verifiers.
  • A common pattern: Claude often chooses minimal or local edits, sometimes suboptimal globally; attempts to correct via CLAUDE.md–style instructions have only partial success.
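
As a rough illustration of the harness idea above, the sketch below checks generated Python source for references to module attributes that do not exist, which is one form a made-up API can take. The function name find_missing_attributes is hypothetical, and the check is deliberately narrow: it only handles plain `import x` and `import x as y` statements, and it imports the named modules in order to inspect them. A type checker or test suite would cover far more.

```python
import ast
import importlib


def find_missing_attributes(source: str) -> list[str]:
    """Return dotted names such as 'math.fast_sqrt' that cannot be resolved.

    Hypothetical helper for illustration; covers only simple module imports.
    """
    tree = ast.parse(source)

    # Map each local alias to the real module name it refers to.
    aliases: dict[str, str] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                aliases[alias.asname or alias.name] = alias.name

    missing: list[str] = []
    for node in ast.walk(tree):
        # Look for `alias.attr` where alias is one of the imported modules.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            module_name = aliases[node.value.id]
            try:
                module = importlib.import_module(module_name)
            except ImportError:
                # The module itself does not exist in this environment.
                missing.append(module_name)
                continue
            if not hasattr(module, node.attr):
                missing.append(f"{module_name}.{node.attr}")
    return missing


generated = "import math\nx = math.sqrt(4)\ny = math.fast_sqrt(4)\n"
print(find_missing_attributes(generated))
```

Here `math.sqrt` exists in the standard library and `math.fast_sqrt` does not, so only the latter is reported. This catches one narrow class of error before the code runs; it says nothing about whether code that does resolve is correct.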

Workflow, Learning, and Productivity

  • Many feel “unlocked”: able to try more ideas, run more experiments, and explore design alternatives quickly, similar to the shift from film to digital.
  • Others worry this leads to shallow thinking: quick prototyping replacing deeper internal reasoning and design “marination.”
  • On learning: some say LLMs accelerate conceptual understanding by enabling more experiments; others feel they learn little unless they deeply review and debug the generated code themselves.

Future Trajectory and Limits

  • One side sees a moving frontier: LLMs have progressed from small functions to multi-file subsystems, so higher-level abstraction and multi-service design may improve similarly within 6–12 months.
  • Another side argues there are hard ceilings:
    • Abstraction failures and hallucinations both persist despite scale-ups.
    • Training on “random internet code” bakes in bad patterns; prompts can’t fully fix that.
  • No agreement on whether we’re nearing a plateau or just mid-curve.

Organizational and Economic Implications

  • Speculation ranges from “mid-level-equivalent AI would be revolutionary” to tongue-in-cheek visions of a CEO plus fleets of agents (and even questioning whether a CEO would still be needed).
  • Some foresee boards and owners still wanting a human “wringable neck” and a guardian against misaligned AI-provider incentives.
  • Broad concern that another path to the middle class (junior dev work) is narrowing, even if senior design and oversight roles remain human for the foreseeable future.