Don’t let an LLM make decisions or execute business logic
Role of LLMs in Software Systems
- Strong agreement that LLMs should not execute business logic or hold authoritative state.
- Widely endorsed pattern: LLMs interpret messy human input and emit structured commands; traditional code enforces rules and performs side‑effectful actions (see the sketch after this list).
- This fits with tool use / MCP: humans write APIs and constraints; models choose which tools to call and with what arguments.
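A minimal sketch of that split in Python, assuming a hypothetical `call_llm` helper and an invented `cancel_order` action (neither comes from the thread): the model only proposes a structured command, and deterministic code decides whether and how to act on it.

```python
import json

# Hypothetical LLM call; in practice this would hit an actual model API.
# Here it just stands in for "model turns messy text into JSON".
def call_llm(prompt: str) -> str:
    return '{"action": "cancel_order", "order_id": "A-1042"}'

# Traditional code owns the rules: which actions exist and what actually runs.
ALLOWED_ACTIONS = {"cancel_order", "update_address", "escalate_to_human"}

def cancel_order(order_id: str) -> None:
    # Side effects and policy checks live here, not in the model.
    print(f"cancelling {order_id} after permission and refund-policy checks")

def handle_customer_message(message: str) -> None:
    raw = call_llm(
        "Convert this customer message into a JSON command with keys "
        f"'action' and 'order_id':\n{message}"
    )
    try:
        command = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("model did not return valid JSON")
    action = command.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"model proposed an unknown action: {action!r}")
    if action == "cancel_order":
        cancel_order(command["order_id"])
    # ... other allowed actions dispatch to ordinary functions the same way.

handle_customer_message("hey, I ordered the wrong size, please cancel A-1042")
```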
Front‑End vs Back‑End and Quality Debate
- Some treat LLM-generated UI as “good enough” while hand‑coding all core logic.
- Others push back: “it’s just the UI” is a misconception; front‑end bugs can create auth, injection, and UX problems.
- Arguments that front‑ends churn more and are more fault‑tolerant, so perfect code quality is less critical; counterargument that disrespecting front‑end craft leads to brittle systems.
Experiences with LLM-Driven Apps and Games
- Reports of LLM-only interactive systems (NPCs, choose‑your‑own‑adventure, game‑adjacent products) being impressive in demos but fragile, hard to test, and hard to maintain.
- Teams often end up replacing “LLM runs everything” with orchestration code, multiple specialized prompts, and explicit state machines (see the first sketch after this list).
- Interesting twist: using RAG not to add knowledge but to hide facts from the model until “discovered,” so puzzles aren’t spoiled (see the second sketch after this list).
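One way to read “orchestration code, specialized prompts, and explicit state machines” is the sketch below; the scene names and the stubbed `call_llm` are assumptions for illustration. The transition table in ordinary code owns the game state, and the model only narrates the scene the code says the player is in.

```python
from enum import Enum, auto

# Hypothetical LLM call, stubbed so the sketch runs on its own.
def call_llm(prompt: str) -> str:
    return "(narration for the current scene)"

class Scene(Enum):
    TAVERN = auto()
    FOREST = auto()
    ENDING = auto()

# Each state gets its own narrow prompt; the transition table, not the model,
# decides where the story is allowed to go next.
PROMPTS = {
    Scene.TAVERN: "Narrate the tavern. The player can only leave via the north door.",
    Scene.FOREST: "Narrate the forest. The player can go back south or enter the shrine.",
    Scene.ENDING: "Narrate the ending.",
}
TRANSITIONS = {
    (Scene.TAVERN, "go north"): Scene.FOREST,
    (Scene.FOREST, "go south"): Scene.TAVERN,
    (Scene.FOREST, "enter shrine"): Scene.ENDING,
}

def step(scene: Scene, player_input: str) -> tuple[Scene, str]:
    # The LLM only writes flavor text for the current scene.
    narration = call_llm(f"{PROMPTS[scene]}\nPlayer said: {player_input}")
    # Unknown commands keep the player in place instead of letting the model drift.
    next_scene = TRANSITIONS.get((scene, player_input.strip().lower()), scene)
    return next_scene, narration

print(step(Scene.TAVERN, "go north"))
```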
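The “hide facts until discovered” twist might look roughly like this; the toy corpus, field names, and keyword “retrieval” are stand-ins, not anyone’s actual implementation. Only facts whose discovered flag has been flipped by game code can ever reach the prompt.

```python
# Toy corpus; each fact carries a flag that game code flips when the player
# earns the reveal. Names and structure are invented for illustration.
FACTS = [
    {"id": "culprit", "text": "The gardener stole the necklace.", "discovered": False},
    {"id": "alibi", "text": "The butler was in town all evening.", "discovered": True},
]

def retrieve(query: str, facts: list[dict]) -> list[str]:
    # Only facts already marked discovered are eligible for the prompt, so the
    # model cannot spoil the puzzle by paraphrasing the hidden answer.
    visible = [f["text"] for f in facts if f["discovered"]]
    # A real system would rank by embedding similarity; a crude keyword match
    # keeps this sketch self-contained.
    words = query.lower().split()
    matched = [t for t in visible if any(w in t.lower() for w in words)]
    return matched or visible

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, FACTS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Who took the necklace?"))
```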
Where LLMs Work Well
- Converting unstructured text into structured data; classification; summarization; fuzzy matching; document and UI test analysis.
- Assisting with repetitive code edits and refactors, treated as a “fuzzy regex engine” whose output is reviewed (see the diff‑review sketch after this list).
- “Vibes-based” or approximate domains: tax research, shopping and gift suggestions, content rewriting, translations, basic layout snippets.
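The “fuzzy regex engine whose output is reviewed” workflow could be wired up like this sketch, with a stubbed `call_llm` in place of a real model: the suggestion is rendered as a unified diff so a human (or the test suite) accepts or rejects it rather than applying it blindly.

```python
import difflib

# Hypothetical LLM call; this stub just performs the requested rename so the
# sketch runs without a real model.
def call_llm(prompt: str) -> str:
    return prompt.rsplit("---\n", 1)[-1].replace("getUserName", "get_user_name")

def propose_refactor(source: str, instruction: str) -> str:
    return call_llm(f"{instruction}\nRewrite the code below; output only code.\n---\n{source}")

def review(original: str, rewritten: str) -> None:
    # The model's output is a suggestion: render it as a diff and let a human
    # (or the tests) decide whether to apply it.
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        rewritten.splitlines(keepends=True),
        fromfile="before",
        tofile="after",
    )
    print("".join(diff))

code = "def handler(req):\n    return getUserName(req)\n"
review(code, propose_refactor(code, "Rename getUserName to get_user_name"))
```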
Concerns: Reliability, Testing, and State
- LLM narratives drift over long interactions; they forget rules and state, making them unsuitable for long‑running logic.
- LLM-only systems are described as “fragile” and “a testing nightmare”; reproducibility and debugging are difficult.
- Comparison to humans: humans err too, but they learn and can be held accountable; current models repeat the same classes of mistakes.
Disagreement on Article’s Claim and Future Trajectory
- Some find the article obvious or confusing (especially around “implement vs execute logic”); others say the core message is valuable and underappreciated.
- Debate over whether LLMs will eventually handle end‑to‑end agents reliably (“cars vs horses” analogy) versus skepticism that progress will plateau sooner than boosters expect.
- Recognition that non‑technical users and “vibe coders” will keep using LLMs as full stacks anyway, creating ecosystems of “sometimes working” software.