AI coding at home without going broke
Cost of AI Coding at Home
- Many say $20–$100/month subs (Claude, Gemini, Cursor, etc.) are enough for serious side projects if you stay engaged and review code.
- Others report burning through $200+ plans and even multiple subs, especially with agentic workflows, automations, or long unattended runs.
- API pricing vs subscriptions: fixed-price plans are heavily subsidized relative to list API rates; some push them to the limit, effectively getting thousands of dollars of tokens for a few hundred in fees.
DeepSeek and Middlemen
- DeepSeek V4 Flash/Pro frequently cited as “cheat code”:
1–2 orders of magnitude cheaper than frontier US APIs for “80–95%” of the quality in coding tasks. - Direct DeepSeek API is much cheaper than most OpenRouter providers; caching is a major cost saver, and some claim OpenRouter routing/headers hurt cache rates.
- Some prefer Opencode Go or other harnesses that bundle DeepSeek for a flat fee; others argue direct API is still cheaper.
Cloud vs Self‑Hosting
- Self-hosting seen as mainly a privacy play; hardware (high‑VRAM GPUs, DGX Spark, Halo/Stryx, etc.) is expensive, and electricity is not free.
- Back‑of‑the‑envelope energy comparisons: humans are metabolically efficient; but “human + LLM” can save time and thus overall resource use in some views.
- Older or “free” GPUs (e.g., 1080 Ti) can run mid-size models cheaply, but often aren’t clearly cheaper than ultra‑cheap hosted models like DeepSeek Flash.
Local Models and Capability
- Consensus: nothing truly at Opus‑level locally yet; best local setups reach somewhere near Sonnet‑tier with expensive multi‑GPU rigs.
- Many are satisfied with Qwen 3.x, Gemma 4, and similar 26–35B models on 24–128 GB RAM for non‑“vibe coding” tasks, especially when used in tight, function‑level prompts.
Token Usage Patterns
- Heavy spend often comes from:
- Long “plan mode” sessions and huge contexts.
- Many tools/skills/MCPs loaded every turn.
- Autonomous agents grinding on poorly scoped tasks, refactors, or reverse engineering.
- Lower spend correlates with:
- Short, focused sessions; frequent restarts.
- Clear specs, smaller tasks, and local tools (tests, search, embeddings) to reduce context.
- Using cheaper models for routine coding and reserving premium models for analysis/architecture.
Emotional and Career Impact
- Some long‑time “craft” developers express grief and burnout: feeling displaced by agents, loss of code-as-art, and fear of becoming mere “caretakers of machines.”
- Others argue the pendulum will swing back toward product thinking and human‑centric design; AI remains a tool, not a full replacement.
Privacy, Jurisdiction, and Alternatives
- Some avoid US middlemen, preferring EU‑only providers or local inference for GDPR/privacy reasons.
- There’s skepticism that paying for cloud automatically guarantees privacy.
Future Outlook
- Several expect home‑runnable models to reach today’s mid‑frontier level within a few years and advise delaying big hardware buys.
- Others warn that future chips/models may be locked down or geopolitically constrained, making local autonomy uncertain.