The new skill in AI is not prompting, it's context engineering

What “context engineering” is about

  • Commenters broadly agree that good results come less from “magic prompts” and more from assembling the right information, tools, and history for the model at each step.
  • Emphasis is on better context, not more: relevant documents, examples, schemas, tool descriptions, recent edits, etc., structured so the model can plausibly solve the task.
  • Several people liken this to classic software practices: specs, UX requirements, tech lead work, and environment/“bureaucracy” design rather than one-shot clever phrasing.
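The "assemble the right information, structured for the task" idea can be sketched as a small context-builder. Everything here is illustrative, not any real library's API; the labels, ordering, and character budget are assumptions standing in for whatever a real pipeline would use:

```python
# A minimal sketch of "better context, not more": gather only the pieces
# relevant to the task, label them, and enforce a size budget so the
# model sees a focused window. All names are hypothetical.

def assemble_context(task: str, docs: list[str], examples: list[str],
                     schema: str, recent_edits: list[str],
                     budget_chars: int = 4000) -> str:
    """Concatenate labeled sections, skipping any section that would
    exceed the character budget (a crude stand-in for token counting)."""
    sections = [
        ("Task", task),
        ("Schema", schema),
        ("Relevant documents", "\n---\n".join(docs)),
        ("Examples", "\n---\n".join(examples)),
        ("Recent edits", "\n".join(recent_edits)),
    ]
    out, used = [], 0
    for label, body in sections:
        block = f"## {label}\n{body}\n"
        if used + len(block) > budget_chars:
            continue  # drop what doesn't fit rather than truncating mid-section
        out.append(block)
        used += len(block)
    return "\n".join(out)
```

The point of the sketch is the ordering and the budget: curation is an explicit design decision, not a side effect of a clever prompt.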

Prompting vs context: real distinction or rebrand?

  • One camp says this is just prompt engineering with a new name; everything is “just tokens in the context window.”
  • Others argue “prompt” (what the user types) vs “context” (system prompts, history, retrieved docs, tool metadata, agent state) is a useful conceptual split, especially for multi-step agents.
  • There’s criticism both of anthropomorphizing LLMs (“like humans”) and of buzzword churn, but also the view that “prompt engineering” had been trivialized to mean “typing into chat,” so a new term helps.
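The prompt-vs-context split becomes concrete in a chat-style API, where the user's prompt is only the final message and everything else is application-controlled context. The request shape below is a generic sketch, not any specific vendor's schema:

```python
# "Prompt" vs "context", made concrete: the user's typed prompt is one
# message at the end; system instructions, conversation history,
# retrieved documents, and tool metadata are context the application
# assembles around it. The dict shapes here are illustrative.

def build_request(user_prompt: str, system: str, history: list[dict],
                  retrieved: list[str], tools: list[dict]) -> dict:
    messages = [{"role": "system", "content": system}]
    messages += history  # prior turns: agent state the user never typed
    if retrieved:
        messages.append({
            "role": "system",
            "content": "Relevant documents:\n" + "\n".join(retrieved),
        })
    messages.append({"role": "user", "content": user_prompt})
    return {"messages": messages, "tools": tools}
```

Seen this way, "context engineering" names everything in the request except the last message, which is why the split matters most for multi-step agents.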

Technical issues: long contexts, tools, and agents

  • Long contexts degrade (“context rot”); models weight early tokens more, and practical accuracy often drops far before the advertised max window.
  • Techniques discussed: tool loadout (choosing small subsets of tools per step), context pruning/summarization/offloading, quarantining noisy data, and using sub‑agents to keep each context focused.
  • Some expect future models with stable huge contexts and support for thousands of tools to make many current multi-agent architectures obsolete; others note costs, latency, and token pricing will still force routing and pruning.
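The "tool loadout" technique can be sketched as a ranking step before each agent turn: score every available tool against the current task and expose only the top few. The word-overlap scoring below is a deliberately simple stand-in for embedding-based retrieval, and all names are hypothetical:

```python
# "Tool loadout": instead of offering every tool on every step, rank
# tools against the current task and pass only a small subset, keeping
# the context focused. Word overlap is a toy proxy for semantic search.

def select_tools(task: str, tools: dict[str, str], k: int = 3) -> list[str]:
    """tools maps tool name -> description; return the k best matches."""
    task_words = set(task.lower().split())

    def score(item: tuple[str, str]) -> int:
        _name, desc = item
        return len(task_words & set(desc.lower().split()))

    ranked = sorted(tools.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]
```

The same idea generalizes to pruning and sub-agents: each step gets a context chosen for it, rather than the union of everything the system knows.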

Skepticism, rigor, and “engineering”

  • Many complain that “context/prompt engineering” is often trial-and-error tinkering dressed up as a discipline—likened to alchemy, SEO, or WoW strategy guides.
  • Others say it becomes real engineering once you add systematic evaluations, experiments, and measurable improvements; without evals you’re just guessing.
  • Determinism is debated: in theory, greedy decoding with a fixed seed is deterministic, but non-associative parallel floating‑point reductions and temperature sampling mean outputs often vary in practice.
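The "without evals you're just guessing" point implies even a minimal harness: run a fixed set of cases through the system and attach a pass rate to every context change. This is a sketch of the shape of such a harness, with `run_model` as a placeholder for whatever actually calls the model:

```python
# A minimal evaluation harness: the step that turns context tinkering
# into something measurable. `run_model` is a hypothetical callable
# standing in for the real model-invoking pipeline.

from typing import Callable

def evaluate(run_model: Callable[[str], str],
             cases: list[tuple[str, Callable[[str], bool]]]) -> float:
    """cases: (input, check) pairs, where check(output) -> bool.
    Returns the fraction of cases that pass."""
    passed = sum(1 for inp, check in cases if check(run_model(inp)))
    return passed / len(cases)
```

With this in place, "did adding the schema to the context help?" becomes a before/after comparison of two numbers instead of an impression.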

Real-world experience: powerful but brittle

  • Positive reports: full plugins, Manim animations, hybrid rules+ML pipelines, and complex refactors built quickly when context is well-curated.
  • Negative reports: agents that loop, break code, or produce plausible-but-wrong answers even with rich context—leading some to revert to manual coding.
  • Overall: context matters a lot, but current models still hallucinate, fail on multi-step tasks, and require human review; how durable this “skill” is as models evolve remains unclear.