Parallel coding agents with tmux and Markdown specs

Perceived productivity vs. “where’s the software?”

  • Skeptics ask why, if multi-agent setups are so productive, we don’t yet see much clearly “great” AI-built software.
  • Others argue it’s early: tools only got good recently, feedback cycles for software quality are long, and much of the output is internal tools or personal projects.
  • Several note that AI mostly increases volume of “mundane” but useful software, not necessarily greatness; average quality may even drop as volume rises.

Reported use cases and concrete wins

  • Many describe internal or personal tools: finance and chat apps, window compositors, status dashboards, automation scripts, filesystem/Ansible helpers, browser automation, CI tooling, ZFS backends, games, and reverse‑engineered docs for legacy systems.
  • One user claims a ~20%+ reduction in PR time-to-merge via parallel review agents, but this is challenged as over-extrapolated and not business‑proven.
  • Others mention multi-agent code review at large companies and production-grade products built on agentic coding.

Orchestration patterns: tmux, specs, worktrees

  • Common pattern: one Markdown spec per agent/pane; agents work in parallel against git worktrees or repo copies to avoid clashes.
  • Some prefer 2–3 focused agents (backend/frontend/tests) over 6–8, citing merge conflicts and cognitive overhead.
  • Others build higher-level “factory” abstractions with a supervisor agent that decomposes work, spawns workers, and manages worktrees/merges.
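The worktree-per-agent pattern above can be sketched as a small supervisor script. This is a minimal illustration, not a tool from the thread: the agent CLI (`agent --spec SPEC.md`) is a hypothetical placeholder, and the tmux commands are printed rather than executed so the sketch runs without a tmux server.

```python
# Sketch: one git worktree + one tmux window per Markdown spec.
# Assumes a hypothetical agent CLI invoked as `agent --spec FILE`.
import shlex
import subprocess
import tempfile
from pathlib import Path

def run(cmd: str, cwd: Path) -> None:
    subprocess.run(shlex.split(cmd), cwd=cwd, check=True,
                   capture_output=True, text=True)

base = Path(tempfile.mkdtemp())          # demo area; use your real repo in practice
repo = base / "repo"
repo.mkdir()
run("git init -q .", repo)
run("git -c user.email=a@b -c user.name=agent commit -q --allow-empty -m init", repo)

specs = {"backend": "# Backend tasks\n", "frontend": "# Frontend tasks\n"}
tmux_cmds = []
for name, text in specs.items():
    wt = base / f"wt-{name}"
    # Each agent gets its own branch and worktree, so parallel edits never clash.
    run(f"git worktree add -q -b agent/{name} {wt}", repo)
    (wt / "SPEC.md").write_text(text)
    # One tmux window per agent, started inside its own worktree:
    tmux_cmds.append(f"tmux new-window -n {name} -c {wt} 'agent --spec SPEC.md'")

print("\n".join(tmux_cmds))
```

Merging back is then ordinary git: each `agent/<name>` branch is reviewed and merged once its worktree’s tests pass.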

Context management and documentation

  • A major challenge is context drift across sessions; reported solutions include:
    • Per-agent spec docs plus an orchestration doc.
    • Tools like agent-doc, Beads, or entity-centered “NERDs” documents.
    • PROJECT.md / SPEC.md to record direction, key decisions, and avoid scope creep.
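A PROJECT.md along these lines might look as follows. The section names and example entries are illustrative suggestions, not a format prescribed in the thread; the point is that direction, settled decisions, and scope boundaries are written down where every agent session can re-read them.

```markdown
# PROJECT.md  (sketch — headings and entries are hypothetical)

## Direction
One paragraph on what this project is, and what it deliberately is not.

## Key decisions
- Postgres over SQLite (multiple writers needed). Settled; do not revisit.

## Out of scope
- No changes to the auth provider in this iteration.

## Current state
Short summary the orchestrator updates after each session to limit context drift.
```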

Costs, quotas, and optimization

  • Parallel agents rapidly exhaust top-tier subscriptions; several hit weekly limits in 3–4 days.
  • Strategies: mix cheaper models, “oracle” agents for code questions, cheap supervisors to detect spec gaps, and strong planning/checkpointing to reduce wasted “thinking” tokens.
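The mix-of-models strategy can be sketched as a routing function: a cheap pre-flight check flags spec gaps before any expensive model runs, and mundane tasks go to a cheaper worker. Model names and the gap heuristic here are placeholders, not from the thread.

```python
# Sketch: route tasks to cheaper models when possible (names are placeholders).
REQUIRED_SECTIONS = ("## Goal", "## Acceptance criteria")

def spec_gaps(spec_text: str) -> list[str]:
    """Cheap pre-flight check: list missing spec sections before burning
    expensive 'thinking' tokens on an underspecified task."""
    return [s for s in REQUIRED_SECTIONS if s not in spec_text]

def pick_model(task: str, spec_text: str) -> str:
    if spec_gaps(spec_text):
        return "cheap-supervisor"   # bounce back for clarification first
    if task in {"rename", "format", "docstring"}:
        return "cheap-worker"       # mundane edits don't need the big model
    return "frontier-coder"

print(pick_model("refactor", "## Goal\n...\n## Acceptance criteria\n..."))
```

The same shape works for checkpointing: persist `spec_gaps` output alongside the spec so a resumed session starts from a known-complete plan instead of re-deriving it.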

Quality, validation, and safety

  • Strong emphasis on tests, verification commands, and sometimes separate reviewer agents.
  • Concern about agents bypassing deny lists; some invert the model and require “proof of safety” (explicit intents, path checks, diffs, tests) before any action.
  • Long-term impact on maintainability and design quality is seen as unclear and unmeasured.
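The inverted “proof of safety” model from the safety bullet can be sketched as a gate that rejects any action lacking an explicit intent and an in-bounds target path. The allowed roots and function name are illustrative assumptions, not a real tool’s API.

```python
# Sketch: deny-by-default action gate (allowed roots are assumed project layout).
from pathlib import Path

ALLOWED_ROOTS = (Path("src").resolve(), Path("tests").resolve())

def prove_safe(intent: str, target: str) -> bool:
    """Invert the deny-list model: an action is rejected unless the agent
    states an explicit intent AND the target path, after resolution,
    stays inside an allowed root (normalizes '..' segments and follows
    any existing symlinks before checking)."""
    if not intent.strip():
        return False
    p = Path(target).resolve()
    return any(p == root or root in p.parents for root in ALLOWED_ROOTS)
```

A fuller version would also demand a diff and a passing test run before applying the change, matching the “explicit intents, path checks, diffs, tests” list above.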