Context is the bottleneck for coding agents now

Fine‑tuning vs “context engineering”

  • Some ask whether LoRA or similar fine‑tuning on a proprietary codebase could replace complex prompt/context work.
  • Others respond that current coding models are mostly fine‑tuned for tool use, not for embedding large private codebases as “knowledge.”
  • Concerns: resource cost for mid‑size companies, risk of over‑specializing and degrading general performance, and the fact that LoRA trains small adapters alongside frozen base weights rather than overwriting them (a rough sketch of such a fine‑tune follows this list).
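
For concreteness, a minimal sketch of what such a fine‑tune looks like with Hugging Face transformers and peft. The checkpoint name, rank, and target modules are illustrative assumptions, not a recommended recipe; the point is that LoRA trains small adapter matrices while the base weights stay frozen.

    # LoRA fine-tuning sketch (peft). Checkpoint and hyperparameters are placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("some-open-code-model")   # hypothetical checkpoint
    tokenizer = AutoTokenizer.from_pretrained("some-open-code-model")

    # LoRA attaches low-rank adapter matrices next to the frozen base weights;
    # the base model's general knowledge is augmented, not overwritten.
    config = LoraConfig(
        r=16,                                  # adapter rank (low-rank bottleneck)
        lora_alpha=32,                         # scaling factor
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],   # attention projections, a common choice
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()         # typically well under 1% of the base model
    # ...then train on (prompt, completion) pairs mined from the proprietary codebase.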

Codebase structure and “LLM‑compatible” design

  • Several people report that well‑layered, modular, documented codebases produce far better LLM output than tangled monoliths.
  • Some advocate microservices and strict architecture docs as a way to keep per‑task context small; others argue this prematurely increases complexity and is overkill unless scale truly demands it.
  • A recurring idea: deliberately refactor and document code so LLMs can work reliably in it (shorter files, clear modules, inline rationale, “don’t do X or it breaks Y” notes); an example of such notes follows this list.
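
One way such "LLM‑compatible" documentation might look in practice; the module, names, and numbers are hypothetical, and the point is that constraints and rationale live next to the code an agent will read:

    """billing/invoices.py: invoice totals (hypothetical example).

    Notes for agents and humans:
    - Amounts are integer cents to avoid float rounding drift.
    - Don't reorder the discount and tax steps: tax applies to the discounted
      amount, and downstream reports assume that order.
    """

    def invoice_total(line_items_cents: list[int], discount_cents: int, tax_rate: float) -> int:
        # Discount first, then tax (see module docstring before changing this).
        subtotal = sum(line_items_cents) - discount_cents
        return round(subtotal * (1 + tax_rate))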

Context, memory, and hierarchical summaries

  • Many agree that context is also a bottleneck for humans: we operate on compressed mental summaries, not full codebases.
  • Proposed pattern: agents maintain hierarchical notes/summaries (repo → folder → file → function), updating them on every commit so later tasks read summaries rather than raw code (see the sketch after this list).
  • Others counter that human memory is qualitatively different from LLM summarization, which is lossy and brittle, but accept that simulated hierarchical memory can still be useful.
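
A sketch of what that hierarchical memory could look like as a data structure. The file layout and update hook are assumptions about how such an agent memory might be organized, not a description of any particular tool:

    # Hierarchical summary store: repo -> folder -> file -> function one-liners.
    import json
    from pathlib import Path

    SUMMARY_FILE = Path(".agent/summaries.json")   # assumed location

    def load_summaries() -> dict:
        return json.loads(SUMMARY_FILE.read_text()) if SUMMARY_FILE.exists() else {"files": {}}

    def update_file_summary(summaries: dict, path: str, summary: str, functions: dict[str, str]) -> None:
        """Record a file-level summary plus per-function one-liners (run from a post-commit hook)."""
        summaries.setdefault("files", {})[path] = {"summary": summary, "functions": functions}

    def folder_view(summaries: dict, folder: str) -> list[str]:
        """What a later task reads instead of raw code: one line per file under a folder."""
        prefix = folder.rstrip("/") + "/"
        return [f"{path}: {entry['summary']}"
                for path, entry in summaries.get("files", {}).items()
                if path.startswith(prefix)]

    # Post-commit: summarize only the files touched by the commit (e.g. via an LLM call),
    # call update_file_summary for each, then persist with SUMMARY_FILE.write_text(json.dumps(summaries)).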

Context windows, poisoning, and sub‑agents

  • Large contexts often degrade performance; once an LLM “decides” on a bad direction, small corrective prompts struggle against thousands of tokens of prior reasoning (“context poisoning”).
  • Practical workarounds (two of them sketched after this list):
    • Frequently clearing or compacting context; starting new chats with a hand‑crafted summary.
    • Tooling that rewrites/filters history or uses sub‑agents with fresh contexts for searches, navigation, or specific subtasks.
    • Agents that checkpoint plans/notes, then discard detailed history.
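
Two of these workarounds in sketch form: compacting a long history into a checkpoint note, and delegating a narrow search to a sub‑agent with a fresh context. The llm(messages) callable and the token budget are placeholders for whatever chat API and model limits apply:

    MAX_TOKENS = 50_000   # assumed budget; real limits depend on the model

    def approx_tokens(messages: list[dict]) -> int:
        # Rough 4-characters-per-token heuristic, good enough for a budget check.
        return sum(len(m["content"]) for m in messages) // 4

    def compact(messages: list[dict], llm) -> list[dict]:
        """Replace detailed history with the system prompt plus a hand-off note."""
        if approx_tokens(messages) < MAX_TOKENS:
            return messages
        note = llm([{"role": "system", "content": "Summarize the work so far: decisions made, files touched, open questions."},
                    *messages])
        return [messages[0],   # keep the original system prompt
                {"role": "user", "content": f"Checkpoint from earlier work:\n{note}"}]

    def sub_agent_search(question: str, llm) -> str:
        """Run a narrow subtask in a fresh context so prior reasoning cannot poison it."""
        return llm([{"role": "system", "content": "Answer only questions about where code lives; be brief."},
                    {"role": "user", "content": question}])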

Real‑world experience with coding agents

  • Reports range from “entire PRs generated and shipped” to “only occasional help; one‑line fixes are faster by hand.”
  • Long‑horizon, multi‑step work still requires heavy human steering; speed of navigation and limited context are major pain points.
  • Some find that context limits force beneficial refactoring; others see “refactor your whole codebase so the tool works” as backwards.

Capabilities, limits, and responsibility

  • Several commenters dispute that “intelligence” is rapidly increasing, citing hallucinations and confident errors even on simple tasks.
  • Others argue that long‑term bottlenecks will be responsibility and liability: someone still must understand requirements, evaluate designs, review code, and own failures.
  • Broad consensus: agents resemble strong junior developers—powerful accelerators for well‑scoped tasks, but nowhere near autonomous replacements for experienced engineers.