I genuinely don't understand why some people are still bullish about LLMs
Diverging experiences and expectations
- Commenters split sharply: some say LLMs are “miraculous” and daily-use tools; others say they’re useless or worse than older methods.
- A big driver of disappointment is expectations set by hype: AGI talk, “it will replace programmers/doctors/teachers,” and marketing that implies oracle-like reliability.
- Critics emphasize that in domains requiring rigor and novelty (frontier science, complex legacy systems, law, medicine) LLMs routinely hallucinate, miscite, or oversimplify in ways that make them net time-wasters.
Where LLMs work well (according to supporters)
- “Dumb and annoying” tasks: shell one-liners, CSV munging, ad‑hoc scripts, simple SQL, jq filters, YAML/Terraform, boilerplate code, email drafting, markdown/LaTeX tables.
- Transcription and translation: live captions, meeting notes, podcast summaries, extracting action items from call transcripts.
- Rapid prototyping and glue code: small web apps, dashboards, scrapers, basic API clients, internal tools that don’t need high reliability or long-term maintenance.
- Brainstorming and planning: outlines, presentation structure, candidate designs, option comparisons, research starting points, naming tradeoffs the user then evaluates.
- New “reasoning” models and long context windows are reported as genuinely useful for understanding and refactoring medium-size codebases when the user already knows what “good” looks like.
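The "CSV munging" category above is typical of what supporters describe: a throwaway script that is tedious to write by hand but trivial to verify by eye. A minimal sketch of that kind of task (the filename-free input, column names, and threshold here are all hypothetical):

```python
import csv
import io

def total_large_orders(csv_text, threshold=100.0):
    """Keep orders above a threshold and total their amounts --
    the sort of one-off chore commenters say LLMs handle well."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [float(row["amount"]) for row in reader
            if float(row["amount"]) > threshold]
    return len(kept), sum(kept)

data = """order_id,amount
1,50.00
2,150.00
3,200.00
"""
count, total = total_large_orders(data)  # 2 orders, 350.0 total
```

The point supporters make is not that such code is hard, but that outputs like this are cheap to check, which is exactly the regime where an unreliable generator is still a net win.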
Where they often fail or are dangerous
- Scientific and academic work: fabricated papers, bogus citations, wrong publication years, confident but incorrect technical summaries.
- Deep debugging and niche domains: obscure bugs in large proprietary systems, specialized scientific subfields, unusual MPI/HPC setups.
- Customer-facing autonomy: hallucinated legal/medical advice, bogus financial analysis, unreliable support chatbots, fake jurisprudence.
- Systemically: unknown, non-stationary error rates; no clear “I don’t know” behavior; chaining agents compounds small per-step error rates into large end-to-end failure probabilities.
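The agent-chaining point can be made concrete with a back-of-envelope calculation: if steps fail independently, even a high per-step success rate decays geometrically over a pipeline. The rates below are illustrative assumptions, not measured figures:

```python
def chain_success(p, n):
    """Probability an n-step chain succeeds end-to-end,
    assuming each step succeeds independently with probability p."""
    return p ** n

# 98% per-step reliability sounds excellent, but over a
# 30-step agent chain it leaves roughly a coin flip:
print(round(chain_success(0.98, 30), 3))  # ~0.545
```

This is why critics argue that agent autonomy, not single-shot assistance, is where unverified LLM output becomes dangerous.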
Tool vs. hype
- Many argue LLMs are best viewed as power tools or “overconfident junior interns”: hugely useful when outputs are cheap to verify, dangerous when treated as authorities.
- Prompting and workflow design are emerging skills; some find this empowering, others see it as friction and sunk-cost rationalization (“you’re holding it wrong”).
- Several see a real tech shift but a financial bubble: enormous capex, unclear long‑term margins, and valuations running far ahead of the actual, mostly narrow, productivity gains.
Broader concerns
- Worries about mass unemployment, deskilling, enshittified products (AI everywhere regardless of fit), disinformation, and environmental cost.
- Counterpoint: even modest, domain-limited productivity gains at scale could be worth hundreds of billions, so “narrow but real” usefulness is enough to justify continued bullishness on the tech (if not on current valuations).
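The "hundreds of billions" counterpoint is a scale argument, not a precision one. A sketch of the arithmetic, where every input is an illustrative assumption rather than a sourced figure:

```python
# Back-of-envelope only: all three inputs are assumptions.
workers = 100_000_000   # knowledge workers with LLM access
avg_cost = 70_000       # assumed fully loaded annual cost per worker, USD
gain = 0.02             # assumed modest (2%) productivity gain

annual_value = workers * avg_cost * gain
print(f"${annual_value / 1e9:.0f}B per year")  # $140B per year
```

Under assumptions of this shape, even a low-single-digit gain lands in the hundreds-of-billions range, which is the bulls' core claim even after discounting the AGI rhetoric.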