Don’t let an LLM make decisions or execute business logic

Role of LLMs in Software Systems

  • Strong agreement that LLMs should not execute business logic or hold authoritative state.
  • Widely endorsed pattern: LLMs interpret messy human input and emit structured commands; traditional code enforces rules and performs side‑effectful actions (a minimal sketch follows this list).
  • This fits with tool use / MCP: humans write APIs and constraints; models choose which tools to call and with what arguments.
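
For concreteness, the pattern looks roughly like this: the model proposes a structured command, and ordinary code decides whether it is allowed and performs the action. Everything in the sketch (call_llm, the command names, refund_window_open) is a hypothetical stand-in, not code from the article or the thread.

```python
import json
from dataclasses import dataclass

ALLOWED_COMMANDS = {"cancel_order", "change_address", "request_refund"}


def call_llm(prompt: str) -> str:
    """Stand-in for whatever chat-completion client is in use (hypothetical)."""
    raise NotImplementedError("plug in a real model client here")


@dataclass
class Command:
    name: str
    order_id: str


def parse_user_request(message: str) -> Command:
    # The model only translates messy text into a structured command proposal.
    raw = call_llm(
        'Return JSON like {"name": ..., "order_id": ...} for this customer '
        f"message. Allowed names: {sorted(ALLOWED_COMMANDS)}.\n\n{message}"
    )
    data = json.loads(raw)
    if data.get("name") not in ALLOWED_COMMANDS:
        raise ValueError(f"model proposed an unknown command: {data!r}")
    return Command(name=data["name"], order_id=str(data["order_id"]))


def refund_window_open(order_id: str) -> bool:
    """Placeholder business rule; a real system would check the order record."""
    return False


def execute(cmd: Command) -> None:
    # Rules and side effects live in ordinary code; the model never runs this.
    if cmd.name == "request_refund" and not refund_window_open(cmd.order_id):
        raise PermissionError("refund window is closed")
    print(f"executing {cmd.name} for order {cmd.order_id}")
```

Keeping the allow‑list and the refund rule in code means the model can at worst propose a wrong command, never execute one.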

Front‑End vs Back‑End and Quality Debate

  • Some treat LLM-generated UI as “good enough” while hand‑coding all core logic.
  • Others push back: “it’s just the UI” is a misconception; front‑end bugs can introduce authentication gaps, injection vulnerabilities, and usability problems.
  • Arguments that front‑ends churn more and are more fault‑tolerant, so perfect code quality is less critical; counterargument that disrespecting front‑end craft leads to brittle systems.

Experiences with LLM-Driven Apps and Games

  • Reports of LLM-only interactive systems (NPCs, choose‑your‑own‑adventure, game‑adjacent products) being impressive in demos but fragile, hard to test, and hard to maintain.
  • Teams often end up replacing “LLM runs everything” with orchestration code, multiple specialized prompts, and explicit state machines (sketched after this list).
  • Interesting twist: using RAG not to add knowledge but to withhold facts from the model until the player “discovers” them, so puzzles aren’t spoiled.
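
Both ideas can be sketched together: an explicit state machine owns the game state, and only facts the player has already discovered are placed in the model's context. The states, HIDDEN_FACTS, and call_llm below are hypothetical illustrations, not code from any project mentioned in the discussion.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


def call_llm(prompt: str) -> str:
    """Stand-in for a real model client (hypothetical)."""
    raise NotImplementedError


class Scene(Enum):
    VILLAGE = auto()
    CAVE = auto()
    ENDING = auto()


# Facts the narrator may use, keyed by the discovery that unlocks each one.
HIDDEN_FACTS = {
    "found_map": "A tunnel behind the waterfall leads out of the cave.",
    "met_hermit": "The hermit is the missing king.",
}


@dataclass
class GameState:
    scene: Scene = Scene.VILLAGE
    discovered: set = field(default_factory=set)


def narrator_prompt(state: GameState, player_input: str) -> str:
    # Retrieval in reverse: only facts the player has already discovered go
    # into the context, so the model cannot spoil the rest.
    known = [HIDDEN_FACTS[key] for key in sorted(state.discovered)]
    return (
        f"Scene: {state.scene.name}. Known facts: {known}. "
        f"Player says: {player_input!r}. Narrate the next beat in two sentences."
    )


def advance(state: GameState, player_input: str) -> str:
    # The explicit state machine, not the model, decides what can happen next.
    if state.scene is Scene.VILLAGE and "cave" in player_input.lower():
        state.scene = Scene.CAVE
    elif state.scene is Scene.CAVE and "waterfall" in player_input.lower():
        state.discovered.add("found_map")
        state.scene = Scene.ENDING
    return call_llm(narrator_prompt(state, player_input))
```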

Where LLMs Work Well

  • Converting unstructured text into structured data; classification; summarization; fuzzy matching; document and UI test analysis (a minimal classification sketch follows this list).
  • Coding assistant for repetitive edits and refactors, treating it as a “fuzzy regex engine” whose output is reviewed.
  • “Vibes-based” or approximate domains: tax research, shopping and gift suggestions, content rewriting, translations, basic layout snippets.
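
A minimal classification sketch in the same spirit: ask the model for one label from a fixed set, then check the answer in code and fall back safely when it is malformed. LABELS and call_llm are assumptions for illustration, not anything cited in the thread.

```python
import json

LABELS = {"billing", "bug_report", "feature_request", "other"}


def call_llm(prompt: str) -> str:
    """Stand-in for a real model client (hypothetical)."""
    raise NotImplementedError


def classify_ticket(text: str) -> str:
    # Treat the model as a fuzzy classifier: ask for one label from a fixed
    # set, then verify the answer instead of trusting it.
    raw = call_llm(
        f"Classify this support ticket as one of {sorted(LABELS)}. "
        f'Reply with JSON like {{"label": "bug_report"}}.\n\n{text}'
    )
    try:
        label = json.loads(raw).get("label", "")
    except (json.JSONDecodeError, AttributeError):
        label = ""
    return label if label in LABELS else "other"  # safe fallback in plain code
```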

Concerns: Reliability, Testing, and State

  • LLM narratives drift over long interactions; they forget rules and state, making them unsuitable for long‑running logic.
  • LLM-only systems are described as “fragile” and “a testing nightmare”; reproducibility and debugging are difficult.
  • Comparison to humans: humans err too, but they learn and can be held accountable; current models repeat the same classes of mistakes.

Disagreement on Article’s Claim and Future Trajectory

  • Some find the article obvious or confusing (especially the “implement vs. execute logic” distinction); others say the core message is valuable and underappreciated.
  • Debate over whether LLMs will eventually handle end‑to‑end agents reliably (the “cars vs. horses” analogy) or whether progress will plateau sooner than boosters expect.
  • Recognition that non‑technical users and “vibe coders” will keep using LLMs as full stacks anyway, creating ecosystems of “sometimes working” software.