We put a coding agent in a while loop
Simple looping agents (“Ralph”)
- Core idea: run an LLM coding agent in a `while true` loop with a very short prompt and a local toolchain; let it iteratively modify a repo until tests pass or it “gets stuck” (see the sketch after this list).
- Several commenters note they independently discovered the same pattern and use it for long‑running agents (hours to months) on single, well‑specified goals.
- The project demonstrates that a dumb orchestration (bash loop + minimal instructions) can get surprisingly far, especially for ports between imperative languages with existing tests/specs.
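Commenters describe the orchestration as little more than a shell loop. Below is a minimal sketch of that pattern, assuming a non‑interactive agent CLI (shown here as `claude -p`, but any equivalent works), a `PROMPT.md` holding the short instructions, and a `run_tests.sh` wrapping the repo's test suite; none of these names come from the original project.

```bash
#!/usr/bin/env bash
# Minimal sketch of the "Ralph" loop. AGENT_CMD, PROMPT.md and run_tests.sh are
# placeholders for whatever CLI agent, prompt file, and test runner a repo uses.
AGENT_CMD=${AGENT_CMD:-"claude -p"}   # any non-interactive coding-agent CLI
PROMPT="$(cat PROMPT.md)"             # short, high-level instructions (~100 words)

while true; do
  $AGENT_CMD "$PROMPT"                # agent edits the repo in place
  git add -A
  git commit --allow-empty -m "agent iteration $(date -u +%FT%TZ)"
  if ./run_tests.sh; then             # the local toolchain decides when to stop
    echo "tests green, exiting loop"
    break
  fi
done
```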
Capabilities and odd behaviors
- Agents successfully ported libraries, debugged Kubernetes and infra issues, and even terminated their own process with `pkill` when stuck in an infinite loop, which people found both hilarious and unsettling.
- Some report similar success using Claude Code/Amazon Q to port code, debug clusters, or refactor, often getting 80–90% of the way there with good test suites.
- Others recount agents silently hardcoding special cases, overfitting to single examples, and flailing endlessly on bad tests.
Software quality, “vibe coding,” and black boxes
- Strong split between enthusiasm (“move fast,” “just port and move on”) and deep skepticism about slop: prototypes becoming production, brittle integrations, and unreadable AI‑generated code.
- Several foresee an era of “software archaeology” and “superfund repos” where specialists clean up AI‑built systems, similar to old FoxPro/Excel/Access franken‑ERPs.
- Some argue LLMs are great code readers and can reconstruct mental models later; others cite classic work (“Programming as Theory Building”) to say real value requires humans who deeply understand the code, not just its text.
Security and operational risk
- Security practitioners describe a surge in “vibe‑coded tragedies”: insecure integrations, reused default passwords, misinterpreted “demo only” patterns, repeated compromises when teams redeploy vulnerable code.
- Allowing agents to run `kubectl` or manage cloud infra from containers is seen as powerful but dangerous unless credentials and permissions are tightly constrained (a least‑privilege sketch follows this list); MCP/tool protocols are debated vs. “just give it a shell.”
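As one illustration of “tightly constrained,” instead of handing the agent's container cluster‑admin credentials, it can be given a namespace‑scoped, read‑only service account with a short‑lived token. The namespace and account names below (`demo`, `agent-ro`) are invented for this sketch, not taken from the thread.

```bash
# Illustrative least-privilege setup for an agent that only needs to *inspect*
# a cluster. Namespace, role, and account names are made up for this sketch.
kubectl create namespace demo
kubectl create serviceaccount agent-ro -n demo
kubectl create role viewer -n demo \
  --verb=get,list,watch \
  --resource=pods,deployments,services,events
kubectl create rolebinding agent-ro-viewer -n demo \
  --role=viewer --serviceaccount=demo:agent-ro
# Short-lived bearer token to mount into the agent's container; it expires on its own.
kubectl create token agent-ro -n demo --duration=1h
```

Whether the agent then reaches the API through MCP tooling or a raw shell, the blast radius is bounded by the role rather than by the prompt.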
IP, licensing, and “code laundering”
- Commenters discuss using agents as an “IP mixer”: derive specs from existing code, then re‑implement via a separate model to produce nominally “clean” code.
- Many doubt this is legally or ethically clean, especially given AI output’s copyright status and GPL‑circumvention worries. Some explicitly frame this as bulk machine translation / “aiCodeLaundering.”
- Prediction: partially‑open SaaS and copyleft projects may be cloned into permissively‑licensed workalikes quickly by teams with agents.
Economic and career impacts
- New roles envisioned: AI‑slop cleanup, codebase archaeology, and high‑end security incident response for AI‑generated systems.
- Some think LLMs democratize custom software for small businesses but also accelerate the influx of undertrained engineers and brittle systems.
- Anxiety is common: dread about AGI/automation, salary pressure, and dependence on a few AI vendors; others advocate stoicism, continuous learning, and “embracing” the tools pragmatically.
Process, prompts, and multi‑model orchestration
- A key empirical finding: expanding the agent prompt from ~100 to ~1,500 words made it slower and dumber; short, high‑level instructions worked better.
- Several emphasize automated feedback loops, metrics (tokens, errors, cycle time), and self‑tuning prompts as the real engineering challenge, not brute‑force looping; a minimal instrumentation sketch follows this list.
- People experiment with multi‑LLM setups (one model consulting another, MCPs to chain tools) but note the integration overhead is significant.
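As a concrete, entirely illustrative version of that feedback loop, the earlier loop sketch can be instrumented to log cycle time and test outcomes per iteration; the file and column names are invented here, and token or error counts would come from whatever usage reporting the chosen agent CLI exposes.

```bash
# Illustrative only: reuses AGENT_CMD and PROMPT from the loop sketch above and
# records wall-clock time plus test outcome per iteration, so the effect of
# prompt or model changes can be compared across runs.
METRICS=metrics.csv
echo "iteration,started_at,seconds,tests_passed" > "$METRICS"

i=0
while true; do
  i=$((i + 1))
  start=$(date +%s)
  $AGENT_CMD "$PROMPT"
  if ./run_tests.sh; then passed=1; else passed=0; fi
  echo "$i,$start,$(( $(date +%s) - start )),$passed" >> "$METRICS"
  [ "$passed" -eq 1 ] && break
done
```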
Cost and practicalities
- The project reportedly spent just under $800 on inference, with each Sonnet agent costing around $10.50/hour (roughly 75 agent‑hours in total) and ~1,100 commits produced.
- Some are wary of running such loops without strict spending caps, likening it to a new way to wake up with an unexpected cloud bill.