Claude Sonnet 4 now supports 1M tokens of context

Initial Reactions & Availability

  • Many are excited about 1M context, especially for large codebases, long documents, and multi-hour “agentic” sessions.
  • Others note it’s API-only (and initially limited to higher usage tiers and select providers); the web UI and non-Max Claude Code users don’t get it yet.
  • Some users report enabling 1M in Claude Code via undocumented headers/env vars; others see staggered rollout and confusion about what’s actually live.

Impact on Coding Workflows

  • Big theme: long context helps most at initialization, when you load a large repo, specs, and prior discussion, but can hurt if you just “dump everything” and let the agent wander.
  • Multiple workflows are shared:
    • Spec-first: write feature/requirements docs, then a plan, then implement in small stages, resetting context between stages.
    • Using project files like CLAUDE.md, status.md, map.md, .plan to track decisions, progress, and give the model a compact, durable “cursor” into the codebase.
    • Frequent commits and using tools (git worktrees, MCP servers, repomaps, Serena, etc.) so the model searches instead of loading entire files.
  • Some prefer manual chat + editor over full agents; others lean heavily on Claude Code / Cursor.
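The durable-files workflow above can be sketched as a small script: assemble a compact “init” prompt from the project files commenters mention (CLAUDE.md, status.md, map.md) under a size budget, so each fresh session starts from a short, current summary instead of the whole repo. The function and budget here are hypothetical illustrations, not any tool’s actual behavior.

```python
# Hypothetical sketch: build an init prompt from durable project files
# (CLAUDE.md, status.md, map.md — names from the workflow described above),
# truncated to a character budget so each reset starts small and current.
import tempfile
from pathlib import Path

DURABLE_FILES = ["CLAUDE.md", "status.md", "map.md"]

def build_init_context(repo: Path, budget_chars: int = 12_000) -> str:
    """Concatenate whichever durable files exist, staying under the budget."""
    parts = []
    used = 0
    for name in DURABLE_FILES:
        path = repo / name
        if not path.exists():
            continue
        remaining = budget_chars - used
        if remaining <= 0:
            break
        snippet = path.read_text(encoding="utf-8")[:remaining]  # crude cut; real tools chunk smarter
        parts.append(f"## {name}\n{snippet}")
        used += len(snippet)
    return "\n\n".join(parts)

# Example: a throwaway repo with two of the three files present.
tmp = Path(tempfile.mkdtemp())
(tmp / "CLAUDE.md").write_text("Project conventions: tests first.")
(tmp / "status.md").write_text("Stage 2 of 4 complete.")
prompt = build_init_context(tmp)
print(prompt.startswith("## CLAUDE.md"))  # → True
```

The point is the shape, not the code: decisions and progress live in small files the model re-reads cheaply, rather than in an ever-growing chat transcript.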

Context Rot, Retrieval & Limits

  • Several link to “context rot” research: performance often degrades as context grows; needle-in-haystack benchmarks are not representative of real reasoning.
  • Reports that models (including past Gemini long-context versions) can technically accept huge inputs but start “forgetting” or ignoring earlier parts after tens of thousands of tokens.
  • Strong sentiment that abstractions + retrieval (RAG, language servers, outlines, repomaps) matter more than raw context size.
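A minimal sketch of the “search instead of load” idea: rank files by overlap with the query and hand the model only the top few snippets. Real setups use a repomap, language server, or full RAG pipeline; this keyword scorer (all names hypothetical) just illustrates the shape of retrieval over raw context size.

```python
# Toy retrieval: score files by query-term frequency and return only the
# best-matching snippets, instead of loading entire files into context.
import re
from collections import Counter

def score(query: str, text: str) -> int:
    terms = set(re.findall(r"\w+", query.lower()))
    words = Counter(re.findall(r"\w+", text.lower()))
    return sum(words[t] for t in terms)

def top_snippets(query: str, files: dict, k: int = 2, width: int = 200):
    ranked = sorted(files.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return [(name, text[:width]) for name, text in ranked[:k] if score(query, text) > 0]

files = {
    "auth.py": "def login(user): ...  # checks password hash and session token",
    "billing.py": "def charge(card): ...  # invoice logic",
    "readme.md": "Setup instructions and project overview",
}
hits = top_snippets("session token login", files)
print(hits[0][0])  # → auth.py
```

Even this naive version keeps the prompt to a few hundred characters per query; embedding-based retrieval or a language server does the same job with far better recall.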

Claude vs Competitors & Pricing

  • Gemini 2.5 Pro is widely praised for long-context code and document understanding and is cheaper per token, but availability and quality of service are pain points.
  • Claude is preferred by many for safety, consistency, prose quality, and Claude Code’s workflow; others find Gemini or GPT‑5 superior on their stacks.
  • Anthropic’s 1M pricing is seen as steep but defensible for high-value use; caching discounts matter. Some fear surprise bills if agents routinely sit in the “expensive band”.

Productivity & “Agentic AI” Debate

  • Experiences are polarized: some claim 2–3×+ productivity on web/full‑stack work; others say agentic tools are net negative, citing thrash, hallucinations, and review overhead.
  • Nuanced consensus:
    • Best gains come on new tech, boilerplate-heavy work, or for juniors.
    • Senior devs in complex, bespoke systems often see smaller or negative returns.
    • Technique (planning, tight scopes, context hygiene) matters at least as much as model choice.