How to code Claude Code in 200 lines of code

Core idea: agent = LLM + tools + loop

  • Many commenters agree the article accurately captures the conceptual core: a while-loop where the LLM chooses tools, the harness runs them, and results go back into context.
  • Several minimal examples are shared (tens of lines in Bash, JS, PHP, Python) to show how small a usable loop can be; a Python sketch in that spirit appears after this list.
  • The post is compared to earlier “how to build an agent” pieces that made the same “emperor has no clothes” point.
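
A minimal loop in that spirit, sketched in Python against the Anthropic messages API (the model id and the single bash tool are illustrative choices, not Claude Code's actual internals):

```python
import subprocess
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# One tool: run a shell command and return its output.
TOOLS = [{
    "name": "bash",
    "description": "Run a shell command and return stdout and stderr.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}]

def run_tool(name: str, args: dict) -> str:
    if name == "bash":
        proc = subprocess.run(args["command"], shell=True,
                              capture_output=True, text=True, timeout=60)
        return proc.stdout + proc.stderr
    return f"unknown tool: {name}"

def agent(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    while True:
        resp = client.messages.create(model="claude-sonnet-4-20250514",
                                      max_tokens=4096, tools=TOOLS,
                                      messages=messages)
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":  # no more tool calls: model is done
            return "".join(b.text for b in resp.content if b.type == "text")
        # Execute every requested tool call and feed the results back in.
        results = [{"type": "tool_result", "tool_use_id": b.id,
                    "content": run_tool(b.name, b.input)}
                   for b in resp.content if b.type == "tool_use"]
        messages.append({"role": "user", "content": results})
```

That really is the whole trick: the model decides, the harness executes, and the transcript is the only state.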

Where real Claude Code diverges

  • Multiple people say the article is now out of date: current Claude Code has parallel subagents, hooks, skills, improved planning, TODO/task management, and more sophisticated context handling.
  • There’s internal plumbing not visible from the loop: UUID-threaded histories, message queues, file-history snapshots, subagent side-chains, queuing of tool calls, etc.; the side-chain pattern is sketched after this list.
  • Some describe Claude Code as closer to an RL-trained conductor/orchestrator than to a 200-line script.
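
The side-chain pattern itself is small even if the surrounding plumbing is not. A hypothetical sketch, reusing the agent() loop from the previous block: each subagent starts with a fresh context, and only its final summary re-enters the parent history:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel_subagents(parent_messages: list, subtasks: list[str]) -> None:
    """Fan subtasks out to isolated agent loops. Each child begins with
    an empty history, so its intermediate tool calls never pollute the
    parent's context; only the final summaries are merged back."""
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(agent, subtasks))  # agent() from above
    for task, summary in zip(subtasks, summaries):
        parent_messages.append({"role": "user",
                                "content": f"[subagent result: {task}]\n{summary}"})
```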

Harness vs model quality

  • One camp argues model improvements (e.g., newer Claude Opus vs earlier Sonnet) dominate; simple harnesses like mini-swe-agent can match or beat fancy ones if the model is strong.
  • Another camp says harness details matter a lot in practice: UX, planning, skills, approvals, context pruning (sketched after this list), and parallelization can make a weaker model plus a good harness competitive for many tasks.
  • Benchmarks and anecdotal comparisons suggest large quality gaps between model generations that no harness can fully erase.
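
Context pruning, one of the harness levers cited above, can be illustrated crudely: keep the original task plus a recent window and drop the middle (real harnesses summarize dropped turns rather than just truncating):

```python
def prune(messages: list, keep_last: int = 20) -> list:
    """Keep the first message (the task) plus the most recent turns.
    Caveat: naive truncation can orphan a tool_result from its
    tool_use pair, which the API rejects; real harnesses summarize
    dropped turns and pin items like the TODO list instead."""
    if len(messages) <= keep_last + 1:
        return messages
    return messages[:1] + messages[-keep_last:]
```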

Planning, TODOs, and “early stopping”

  • A recurring pain point is premature task completion: the model stops after a few steps and declares “done.”
  • Claude Code’s TODO/task tools, repeatedly injected into prompts and kept at the top of context, are cited as a key mitigation; experiments show disabling them significantly degrades performance (the reinjection pattern is sketched after this list).
  • People describe custom variants: persistent “plan.md” files, working-memory files, DSLs for task termination, and “nudges” when the model forgets to call tools.
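
Mechanically, the reinjection trick is simple: keep the task list in harness state, re-emit it on every model call so it never scrolls out of attention, and nudge when the model stops with items still open. A sketch with assumed names, not Claude Code's actual implementation:

```python
todos: list[dict] = []  # items like {"task": "add tests", "done": False}

def todo_block() -> str:
    lines = [f"[{'x' if t['done'] else ' '}] {t['task']}" for t in todos]
    return "Current TODO list:\n" + "\n".join(lines)

def call_model(messages: list):
    # Re-inject the TODO list as the system prompt on every turn so it
    # stays at the top of context rather than scrolling out of view.
    return client.messages.create(model="claude-sonnet-4-20250514",
                                  max_tokens=4096, tools=TOOLS,
                                  system=todo_block(), messages=messages)

def maybe_nudge(messages: list) -> bool:
    # Counter "early stopping": if the model declared itself done while
    # items remain open, push it back into the loop.
    open_items = [t["task"] for t in todos if not t["done"]]
    if open_items:
        messages.append({"role": "user",
                         "content": "Unfinished TODOs remain: "
                                    + ", ".join(open_items)
                                    + ". Continue working."})
    return bool(open_items)
```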

Production complexity, safety, and skepticism

  • Practitioners building large-scale agents emphasize edge cases: user messages during active loops, Slack/webhook integration, approvals, error handling, structured decoding, and resuming async tasks.
  • Some liken the article to “Twitter in 200 lines”: educational but glossing over the bulk of real-world complexity.
  • Concerns are raised about agents’ broad filesystem access and the risks of running them unsandboxed; a minimal approval gate is sketched below.
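
On that last point, even a minimal harness can put a gate between the model and the shell. A toy example (an illustrative allowlist plus interactive confirmation, which is a speed bump, not a substitute for a real sandbox):

```python
SAFE_PREFIXES = {"ls", "cat", "grep", "git diff", "git status"}  # illustrative

def needs_approval(command: str) -> bool:
    return not any(command.strip().startswith(p) for p in SAFE_PREFIXES)

def gated_run_tool(name: str, args: dict) -> str:
    # Wraps run_tool() from the first sketch with a human approval step.
    if name == "bash" and needs_approval(args["command"]):
        if input(f"Run `{args['command']}`? [y/N] ").strip().lower() != "y":
            return "Tool call denied by the user."
    return run_tool(name, args)
```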