How I code with AI on a budget/free

Free and low‑cost access strategies

  • Many comments list generous free tiers: OpenAI daily free tokens, Google Gemini (AI Studio and CLI), Qwen Coder CLI, DeepSeek, GPT‑OSS, Pollinations, LLM7, and OpenRouter’s free models (a minimal API sketch follows this list).
  • Tricks include: depositing small amounts on intermediaries (OpenRouter, chutes.ai) to unlock “free” model usage, using GitHub Copilot and GitHub Models, and Atlassian’s Rovo Dev CLI beta.
  • Several recommend chat frontends or “multi‑provider” tools (Cherry AI, Ferdium, llmcouncil, SelectToSearch) to unify many models and accounts.
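
For anyone unfamiliar with the intermediary route: OpenRouter exposes an OpenAI‑compatible endpoint and tags certain models as free. A minimal sketch, assuming the `openai` Python client; the model slug is illustrative and the list of free models changes often:

    # Minimal sketch: calling a free-tier model through OpenRouter's
    # OpenAI-compatible API. Check https://openrouter.ai/models for
    # currently available ":free" slugs; the one below is illustrative.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_KEY",
    )

    resp = client.chat.completions.create(
        model="deepseek/deepseek-r1:free",  # illustrative free-model slug
        messages=[
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user", "content": "Write a Python function that reverses a linked list."},
        ],
    )
    print(resp.choices[0].message.content)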

Workflows: web chat vs agentic coding tools

  • A sizeable group agrees with the article: web UIs + manual, “surgical” context selection often outperform integrated agents (Cline, Trae, Copilot, Roo, etc.) in quality and cost.
  • Others report the opposite: agentic tools with full‑repo context (Claude Code, Continue.dev, Zed, Windsurf, Amazon Q Dev) drastically reduce hallucinations and better respect project style.
  • There’s broad frustration with slow, multi‑step agents breaking flow; many prefer fast, dumb models for small diffs and completions, and reserve big models for planning or hard reasoning.
  • Several people are building or using context‑packing tools (aicodeprep‑gui, Aider, CodeWebChat, codemerger, repomix) to assemble repo snippets into prompts for web chats; a bare-bones version of the idea is sketched after this list.
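
The packing idea itself is small enough to sketch. Assuming you just want selected file types concatenated with path headers and a rough size cap, something like the following produces a paste-ready blob; tools such as repomix layer ignore rules, token counting, and nicer formatting on top:

    # Bare-bones context packer: concatenate chosen source files with path
    # headers into one prompt blob for a web chat. The extensions and size
    # cap are assumptions; adjust for your repo.
    from pathlib import Path

    INCLUDE_EXTENSIONS = {".py", ".ts", ".md"}
    MAX_CHARS = 150_000  # crude guard against exceeding the model's context window

    def pack_repo(root: str) -> str:
        chunks, total = [], 0
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file() or path.suffix not in INCLUDE_EXTENSIONS:
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            block = f"===== {path.relative_to(root)} =====\n{text}\n"
            if total + len(block) > MAX_CHARS:
                break  # stop before the blob gets too large to paste
            chunks.append(block)
            total += len(block)
        return "\n".join(chunks)

    if __name__ == "__main__":
        print(pack_repo("."))  # e.g. pipe into a clipboard tool, then paste into a web chat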

Model choices and tradeoffs

  • GLM‑4.5, Gemini 2.5 Pro, Claude Sonnet 4, GPT‑5, Qwen3‑Coder, Kimi K2, DeepSeek R1, GPT‑OSS, and Qwen‑Code 405B are repeatedly cited as strong coders on free or cheap access.
  • Opinions on Qwen and Mistral are mixed: some find them “useless” for serious dev, others say they’re fine for focused tasks or summarization. Llama 4 is largely dismissed for coding.
  • Many participants deliberately use a “big planner + smaller executor” pattern: smarter models to generate plans/prompts, cheaper ones (e.g., GPT‑4.1 via Cline) to apply edits; see the sketch below.
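
A rough sketch of that split, assuming an OpenAI-compatible client; the model names and target file are placeholders, and in practice a tool like Cline handles the apply step:

    # "Big planner + smaller executor": a stronger model writes a short plan,
    # a cheaper model applies it. Model names and the target file are
    # placeholders, not a specific recommendation.
    from openai import OpenAI

    client = OpenAI()  # assumes an API key for an OpenAI-compatible provider is configured

    def ask(model: str, system: str, user: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return resp.choices[0].message.content

    task = "Add retry with exponential backoff to the HTTP client"
    source = open("http_client.py").read()  # placeholder file

    # Step 1: the expensive model only plans.
    plan = ask("gpt-5", "You are a senior engineer. Produce a short, numbered edit plan.",
               f"Task: {task}\n\nFile:\n{source}")

    # Step 2: the cheap model executes the plan mechanically.
    patch = ask("gpt-4.1-mini", "Apply the plan exactly. Return only the full updated file.",
                f"Plan:\n{plan}\n\nFile:\n{source}")
    print(patch)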

Local models and fully local stacks

  • Suggestions for local coding models include small Qwen coder variants for near‑instant completions and 30B–70B models (Qwen3 Coder, DeepSeek Coder, quantized Llama 3 70B) for reasoning on GPUs with ~24 GB of VRAM.
  • One detailed vision: a fully local Cursor‑like stack with Ollama for inference and a local vector DB (e.g., LEANN) for memory; a minimal retrieval loop is sketched after this list.
  • Pushback: current consumer‑grade local setups often can’t match large cloud models in depth, reflection, or context length, making the effort/benefit tradeoff questionable for many.
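
As a concrete sketch of the fully local idea: Ollama can serve both an embedding model and a chat model over its local HTTP API, with a naive in-memory cosine search standing in for a real local vector DB such as LEANN. The model names below are examples of what you might have pulled; swap in whatever runs on your hardware:

    # Fully local retrieval loop against a running Ollama server.
    # Model names are examples; any pulled embedding/chat models will do.
    import requests

    OLLAMA = "http://localhost:11434"

    def embed(text: str) -> list[float]:
        r = requests.post(f"{OLLAMA}/api/embeddings",
                          json={"model": "nomic-embed-text", "prompt": text})
        return r.json()["embedding"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        return dot / norm if norm else 0.0

    # Index a few code snippets; in a real stack these would be chunks of the
    # repo stored in a proper local vector DB instead of a Python list.
    snippets = [
        "def retry(fn, attempts=3): ...",
        "class HttpClient: ...",
        "def parse_config(path): ...",
    ]
    index = [(s, embed(s)) for s in snippets]

    question = "Where is the retry logic implemented?"
    q_vec = embed(question)
    context = max(index, key=lambda item: cosine(q_vec, item[1]))[0]

    r = requests.post(f"{OLLAMA}/api/chat", json={
        "model": "qwen2.5-coder:7b",  # example local coder model
        "stream": False,
        "messages": [{"role": "user",
                      "content": f"Context:\n{context}\n\nQuestion: {question}"}],
    })
    print(r.json()["message"]["content"])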

Privacy, “free” usage, and data value

  • Strong disagreement over “free”: some argue trading code and chats for model training is an acceptable price, especially for people who can’t afford subscriptions.
  • Others insist this is not free but a data‑for‑service transaction, warning about long‑term privacy, IP leakage, and “you are the product” dynamics.
  • Debate continues over whether enterprise “no‑training” promises are credible and whether legal/financial penalties actually deter large companies from misuse.
  • Several note that much code is already exposed via other SaaS tools; others reply that resignation doesn’t make the trade harmless.

Perceived complexity, productivity, and code quality

  • Some find the article’s 20‑tab, multi‑model workflow “nightmarish” and would rather just code, using LLMs only as a StackOverflow replacement or for boilerplate.
  • Others report AI rekindling their motivation by shortening the idea‑to‑prototype loop, even if the workflow is elaborate.
  • A few hope AI will push teams toward more modular, well‑documented, microservice‑like designs to fit within model context windows; others warn that without human architectural ownership, both AI‑ and human‑written systems devolve into tangled messes.

Other side topics

  • Concerns are raised about AI’s energy use; replies argue that (so far) personal transport and heating dominate individual energy footprints, though the 2023–2025 AI boom is changing that picture, and some call for explicit carbon pricing.
  • Multiple users critique the blog’s UX (laggy scrolling, blurry diagrams, duplicated text, wrong links); the author acknowledges it was rushed and largely an afterthought compared to the tooling itself.