We put a coding agent in a while loop
Simple looping agents (“Ralph”)
- Core idea: run an LLM coding agent in a `while true` loop with a very short prompt and a local toolchain; let it iteratively modify a repo until tests pass or it “gets stuck” (see the sketch after this list).
- Several commenters note they independently discovered the same pattern and use it for long‑running agents (hours to months) on single, well‑specified goals.
- The project demonstrates that a dumb orchestration (bash loop + minimal instructions) can get surprisingly far, especially for ports between imperative languages with existing tests/specs.
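Commenters describe the orchestration as little more than a shell loop. Below is a minimal sketch of that pattern, assuming a non‑interactive agent CLI (shown here as `claude -p`, but any equivalent works), a `PROMPT.md` holding the short instructions, and a `run_tests.sh` wrapping the repo's test suite; none of these names come from the original project.

```bash
#!/usr/bin/env bash
# Minimal sketch of the "Ralph" loop. AGENT_CMD, PROMPT.md and run_tests.sh are
# placeholders for whatever CLI agent, prompt file, and test runner a repo uses.
AGENT_CMD=${AGENT_CMD:-"claude -p"}   # any non-interactive coding-agent CLI
PROMPT="$(cat PROMPT.md)"             # short, high-level instructions (~100 words)

while true; do
  $AGENT_CMD "$PROMPT"                # agent edits the repo in place
  git add -A
  git commit --allow-empty -m "agent iteration $(date -u +%FT%TZ)"
  if ./run_tests.sh; then             # the local toolchain decides when to stop
    echo "tests green, exiting loop"
    break
  fi
done
```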
Capabilities and odd behaviors
- Agents successfully ported libraries, debugged Kubernetes and infra issues, and even terminated their own process with `pkill` when stuck in an infinite loop, which people found both hilarious and unsettling.
- Some report similar success using Claude Code/Amazon Q to port code, debug clusters, or refactor, often getting 80–90% of the way there with good test suites.
- Others recount agents silently hardcoding special cases, overfitting to single examples, and flailing endlessly on bad tests.
Software quality, “vibe coding,” and black boxes
- Strong split between enthusiasm (“move fast,” “just port and move on”) and deep skepticism about slop: prototypes becoming production, brittle integrations, and unreadable AI‑generated code.
- Several foresee an era of “software archaeology” and “superfund repos” where specialists clean up AI‑built systems, similar to old FoxPro/Excel/Access franken‑ERPs.
- Some argue LLMs are great code readers and can reconstruct mental models later; others cite classic work (“Programming as Theory Building”) to say real value requires humans who deeply understand the code, not just its text.
Security and operational risk
- Security practitioners describe a surge in “vibe‑coded tragedies”: insecure integrations, reused default passwords, misinterpreted “demo only” patterns, repeated compromises when teams redeploy vulnerable code.
- Allowing agents to run `kubectl` or manage cloud infra from containers is seen as powerful but dangerous unless credentials and permissions are tightly constrained (a least‑privilege sketch follows this list); MCP/tool protocols are debated vs. “just give it a shell.”
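As one illustration of “tightly constrained,” instead of handing the agent's container cluster‑admin credentials, it can be given a namespace‑scoped, read‑only service account with a short‑lived token. The namespace and account names below (`demo`, `agent-ro`) are invented for this sketch, not taken from the thread.

```bash
# Illustrative least-privilege setup for an agent that only needs to *inspect*
# a cluster. Namespace, role, and account names are made up for this sketch.
kubectl create namespace demo
kubectl create serviceaccount agent-ro -n demo
kubectl create role viewer -n demo \
  --verb=get,list,watch \
  --resource=pods,deployments,services,events
kubectl create rolebinding agent-ro-viewer -n demo \
  --role=viewer --serviceaccount=demo:agent-ro
# Short-lived bearer token to mount into the agent's container; it expires on its own.
kubectl create token agent-ro -n demo --duration=1h
```

Whether the agent then reaches the API through MCP tooling or a raw shell, the blast radius is bounded by the role rather than by the prompt.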
IP, licensing, and “code laundering”
- Commenters discuss using agents as an “IP mixer”: derive specs from existing code, then re‑implement via a separate model to produce nominally “clean” code.
- Many doubt this is legally or ethically clean, especially given AI output’s copyright status and GPL‑circumvention worries. Some explicitly frame this as bulk machine translation / “aiCodeLaundering.”
- Prediction: partially‑open SaaS and copyleft projects may be cloned into permissively‑licensed workalikes quickly by teams with agents.
Economic and career impacts
- New roles envisioned: AI‑slop cleanup, codebase archaeology, and high‑end security incident response for AI‑generated systems.
- Some think LLMs democratize custom software for small businesses but also accelerate the influx of undertrained engineers and brittle systems.
- Anxiety is common: dread about AGI/automation, salary pressure, and dependence on a few AI vendors; others advocate stoicism, continuous learning, and “embracing” the tools pragmatically.
Process, prompts, and multi‑model orchestration
- A key empirical finding: expanding the agent prompt from ~100 to ~1,500 words made it slower and dumber; short, high‑level instructions worked better.
- Several emphasize automated feedback loops, metrics (tokens, errors, cycle time), and self‑tuning prompts as the real engineering challenge, not brute‑force looping; a minimal instrumentation sketch follows this list.
- People experiment with multi‑LLM setups (one model consulting another, MCPs to chain tools) but note the integration overhead is significant.
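As a concrete, entirely illustrative version of that feedback loop, the earlier loop sketch can be instrumented to log cycle time and test outcomes per iteration; the file and column names are invented here, and token or error counts would come from whatever usage reporting the chosen agent CLI exposes.

```bash
# Illustrative only: reuses AGENT_CMD and PROMPT from the loop sketch above and
# records wall-clock time plus test outcome per iteration, so the effect of
# prompt or model changes can be compared across runs.
METRICS=metrics.csv
echo "iteration,started_at,seconds,tests_passed" > "$METRICS"

i=0
while true; do
  i=$((i + 1))
  start=$(date +%s)
  $AGENT_CMD "$PROMPT"
  if ./run_tests.sh; then passed=1; else passed=0; fi
  echo "$i,$start,$(( $(date +%s) - start )),$passed" >> "$METRICS"
  [ "$passed" -eq 1 ] && break
done
```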
Cost and practicalities
- The project reportedly spent just under $800 on inference, with each Sonnet agent costing around $10.50/hour (roughly 75 agent‑hours in total) and ~1,100 commits produced.
- Some are wary of running such loops without strict spending caps, likening it to a new way to wake up with an unexpected cloud bill.