LLM=true

Token usage, verbosity, and cost pain

  • Many commenters report that dev agents waste huge numbers of tokens on build/test logs, diffs, and over-eager “just to be sure” steps.
  • This hits both context limits (LLMs get confused or “goldfish” after compaction) and wallet limits, especially on multi‑agent workflows or long test suites.
  • Some think users mainly care about context cleanliness; others emphasize hard token caps on paid plans.

LLM=true vs alternative mechanisms

  • Supporters see the proposed LLM=true env var as a clean signal for tools to emit concise, machine-oriented output.
  • Critics argue this is just a special case of verbosity control; a standardized quiet/verbose or “batch/concise” mode would be more general and human‑useful.
  • Several suggest better names like AGENT, DEV_MODE=agent, or CONCISE=1 to avoid tying it to today’s LLM branding.
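A minimal sketch of what the proposal amounts to on the tool side, using the env var names floated in the thread (LLM=true from the original proposal, AGENT as one suggested alternative; the function name output_mode is hypothetical):

```shell
# Pick an output mode based on whether an agent-facing env var is set.
# LLM=true is the proposed convention; AGENT=1 is a suggested alternative.
output_mode() {
  if [ "${LLM:-}" = "true" ] || [ "${AGENT:-}" = "1" ]; then
    echo "concise"   # machine-oriented: no progress bars, banners, or color
  else
    echo "verbose"   # human-oriented: full interactive output
  fi
}
```

The critics' point is visible even in this sketch: the branch is just a verbosity toggle, so a generic --quiet/batch mode would cover the same ground without baking an "LLM" name into every CLI.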

Wrappers, subagents, and caching

  • Popular workaround: use sub‑agents or “runner” helpers on cheaper models that run commands, summarize logs, and only feed essentials back to the main model.
  • Others write wrapper scripts (for gradle, npm, long test suites) that:
    • Redirect full logs to files.
    • Emit only summaries, error lines, or stack traces.
    • Deduplicate repetitive messages.
    • Expose log paths for later inspection.
  • Tools like chronic (from moreutils) and homegrown logging shims play a similar role: no output on success, full dump on failure.
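The wrapper pattern described above can be sketched in a few lines of shell (the function name run_quiet is hypothetical; the tail length and temp-file handling are illustrative choices, not from the thread):

```shell
# chronic-style wrapper: run a command, log everything to a file,
# stay silent on success, and surface only the log tail on failure.
run_quiet() {
  log=$(mktemp)
  if "$@" >"$log" 2>&1; then
    rm -f "$log"               # success: no output, context stays clean
  else
    status=$?
    echo "FAILED: $* (exit $status)" >&2
    tail -n 20 "$log" >&2      # only the last lines reach the agent
    echo "full log: $log" >&2  # expose the path for later inspection
    return "$status"
  fi
}
```

Usage: `run_quiet npm test` or `run_quiet ./gradlew build` in place of the bare command. Deduplicating repetitive messages would be an extra filtering step (e.g. piping the tail through `sort -u` or an awk script) before the output is handed back to the model.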

Overlap with human developer experience

  • Several note that what helps LLMs (less noise, structured logs, predictable flags) also helps humans.
  • Complaints extend to config proliferation and unreadable CLI conventions; suggestions include:
    • Minimal configurations with good comments.
    • Avoiding over‑tooled stacks when unnecessary.
    • Using AI to manage configs and logging setup.

Skepticism, long-term view, and system effects

  • Some see the whole effort as overkill for automating trivial commands, arguing agents should be reserved for tasks that genuinely benefit.
  • Others say retrofitting every CLI for LLM=true is unrealistic; agent frameworks should instead decide which outputs enter context and cache the rest.
  • A few doubt the environmental argument, citing rebound effects (Jevons paradox): efficiency gains may simply encourage more LLM usage.
  • There is debate over whether future models (different architectures, better context management) will make such tooling changes obsolete.