I genuinely don't understand why some people are still bullish about LLMs
Diverging experiences and expectations
- Commenters split sharply: some say LLMs are “miraculous” and daily-use tools; others say they’re useless or worse than older methods.
- A big driver of disappointment is expectations set by hype: AGI talk, “it will replace programmers/doctors/teachers,” and marketing that implies oracle-like reliability.
- Critics emphasize that in domains requiring rigor and novelty (frontier science, complex legacy systems, law, medicine) LLMs routinely hallucinate, miscite, or oversimplify in ways that make them net time-wasters.
Where LLMs work well (according to supporters)
- “Dumb and annoying” tasks: shell one-liners, CSV munging, ad‑hoc scripts, simple SQL, jq filters, YAML/Terraform, boilerplate code, email drafting, markdown/LaTeX tables.
- Transcription and translation: live captions, meeting notes, podcast summaries, extracting action items from call transcripts.
- Rapid prototyping and glue code: small web apps, dashboards, scrapers, basic API clients, internal tools that don’t need high reliability or long-term maintenance.
- Brainstorming and planning: outlines, presentation structure, candidate designs, option comparisons, research starting points, naming tradeoffs the user then evaluates.
- New “reasoning” models and long context windows are reported as genuinely useful for understanding and refactoring medium-size codebases when the user already knows what “good” looks like.
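The "CSV munging" category above is typical of what supporters describe: a throwaway script that is tedious to write by hand but trivial to verify by eye. A minimal sketch of that kind of task (the filename-free input, column names, and threshold here are all hypothetical):

```python
import csv
import io

def total_large_orders(csv_text, threshold=100.0):
    """Keep orders above a threshold and total their amounts --
    the sort of one-off chore commenters say LLMs handle well."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [float(row["amount"]) for row in reader
            if float(row["amount"]) > threshold]
    return len(kept), sum(kept)

data = """order_id,amount
1,50.00
2,150.00
3,200.00
"""
count, total = total_large_orders(data)  # 2 orders, 350.0 total
```

The point supporters make is not that such code is hard, but that outputs like this are cheap to check, which is exactly the regime where an unreliable generator is still a net win.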
Where they often fail or are dangerous
- Scientific and academic work: fabricated papers, bogus citations, wrong publication years, confident but incorrect technical summaries.
- Deep debugging and niche domains: obscure bugs in large proprietary systems, specialized scientific subfields, unusual MPI/HPC setups.
- Customer-facing autonomy: hallucinated legal/medical advice, bogus financial analysis, unreliable support chatbots, fake jurisprudence.
- Systemically: unknown, non-stationary error rates; no clear “I don’t know” behavior; chaining agents compounds small per-step error rates into large end-to-end failure probabilities.
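The agent-chaining point can be made concrete with a back-of-envelope calculation: if steps fail independently, even a high per-step success rate decays geometrically over a pipeline. The rates below are illustrative assumptions, not measured figures:

```python
def chain_success(p, n):
    """Probability an n-step chain succeeds end-to-end,
    assuming each step succeeds independently with probability p."""
    return p ** n

# 98% per-step reliability sounds excellent, but over a
# 30-step agent chain it leaves roughly a coin flip:
print(round(chain_success(0.98, 30), 3))  # ~0.545
```

This is why critics argue that agent autonomy, not single-shot assistance, is where unverified LLM output becomes dangerous.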
Tool vs. hype
- Many argue LLMs are best viewed as power tools or “overconfident junior interns”: hugely useful when outputs are cheap to verify, dangerous when treated as authorities.
- Prompting and workflow design are emerging skills; some find this empowering, others see it as friction and sunk-cost rationalization (“you’re holding it wrong”).
- Several see a real tech shift but a financial bubble: enormous capex, unclear long‑term margins, and valuations running far ahead of the actual, mostly narrow, productivity gains.
Broader concerns
- Worries about mass unemployment, deskilling, enshittified products (AI everywhere regardless of fit), disinformation, and environmental cost.
- Counterpoint: even modest, domain-limited productivity gains at scale could be worth hundreds of billions, so “narrow but real” usefulness is enough to justify continued bullishness on the tech (if not on current valuations).
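The "hundreds of billions" counterpoint is a scale argument, not a precision one. A sketch of the arithmetic, where every input is an illustrative assumption rather than a sourced figure:

```python
# Back-of-envelope only: all three inputs are assumptions.
workers = 100_000_000   # knowledge workers with LLM access
avg_cost = 70_000       # assumed fully loaded annual cost per worker, USD
gain = 0.02             # assumed modest (2%) productivity gain

annual_value = workers * avg_cost * gain
print(f"${annual_value / 1e9:.0f}B per year")  # $140B per year
```

Under assumptions of this shape, even a low-single-digit gain lands in the hundreds-of-billions range, which is the bulls' core claim even after discounting the AGI rhetoric.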