Three Years from GPT-3 to Gemini 3

Perceived Progress and Capabilities

  • Many see Gemini 3 as a substantial step up: useful for coding, product design discussions, math help, and high-quality editing. Some report 2–3x productivity or quality gains (e.g., faster code, better emails, thesis support).
  • Others argue the demos are cherry‑picked, and the touted “PhD‑level” paper is criticized as pattern‑matching and cargo‑cult research rather than genuine insight.
  • Several describe the models as “competent grad student” or “intermediate dev” alternating with “raving lunatic.” You still need domain knowledge to validate outputs.

Hallucinations, Reliability, and Gell‑Mann Effect

  • Hallucinations are seen as changed, not solved: fewer obvious factual glitches, more confident, self‑justifying nonsense (invented APIs, references, or methods).
  • Users note self‑contradictory reasoning and “embarrassed” behavior when models are corrected.
  • Multiple comments liken trust in AI on unfamiliar topics to the Gell‑Mann amnesia effect: you see errors in your own field yet assume quality elsewhere.

Interfaces and UX: Text vs Voice vs Generative UI

  • Strong defense of text: high information density, easy to skim, quote, and iterate. Many power users prefer chat/CLI over video or voice.
  • Others praise voice interaction (e.g., in cars, brainstorming), but complain about overly perky personalities and slowness.
  • Some expect multimodal agents and “generative UI” (dynamic, model‑designed interfaces) to be the next big shift; others think plain textboxes, tables, and graphs will remain dominant because humans haven’t changed.

Research, Novelty, and Cognitive Atrophy

  • In math and research, models help with calculations, literature surfacing, and idea refinement, but often just regurgitate known results unless heavily guided.
  • Several argue current LLMs are “huge librarians,” structurally biased toward the most probable answer, not genuine novelty.
  • There’s concern about “neural atrophy” as people offload more thinking to AI; historical analogies to books and calculators are debated.
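The “biased toward the most probable answer” claim above is a statement about decoding: greedy decoding always picks the mode of the next‑token distribution, while temperature sampling only occasionally strays from it. A toy sketch (the tokens and logits are invented purely for illustration, not from any real model):

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution at a given temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token continuations: one "safe" answer dominates.
tokens = ["the well-known result", "a novel variant", "a wild conjecture"]
logits = [4.0, 1.0, 0.5]

# Greedy decoding: always returns the most probable continuation.
greedy = tokens[logits.index(max(logits))]

# Temperature sampling: a higher temperature flattens the distribution,
# so less probable continuations are drawn more often -- but the mode
# still dominates unless the temperature is very high.
probs = softmax(logits, temperature=1.5)
sampled = random.choices(tokens, weights=probs)[0]
```

This is only a caricature of decoding, but it makes the structural point concrete: novelty has to come from the tails of the distribution, and standard decoding is tuned to avoid the tails.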

Coding, Agents, and Security

  • Heavy use of AI for coding: “vibecoding” entire apps, then reviewing and steering, is becoming common for some; others find the same models stubborn, context‑blind, and grifty.
  • Agentic tools that can run commands or edit files raise security concerns. Some only run them in containers/VMs; others grant full access, relying on permission prompts or YOLO attitudes.
  • Worry that we’ve regressed on basic security norms by piping proprietary code and system access into opaque third‑party models.
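The container approach mentioned above can be sketched with standard Docker flags. The base image, mount layout, and the idea of dropping into a shell (rather than invoking any particular agent binary) are placeholder assumptions, not a specific tool's documented setup:

```shell
# Sketch: run an agentic coding session in a throwaway container so it can
# only touch the mounted project directory, not the rest of the host.
docker run --rm -it \
  --network none \
  --read-only \
  --tmpfs /tmp \
  -v "$PWD":/workspace \
  -w /workspace \
  python:3.12-slim bash
```

`--network none` blocks outbound access unless you deliberately relax it, `--read-only` plus `--tmpfs /tmp` keeps the root filesystem immutable, and the single bind mount confines writes to the project copy. Those who "grant full access" are effectively skipping all of these flags.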

Economics, Education, and Jobs

  • Debate over whether the massive AI spend is exceptional compared with what other sectors receive, and whether it is delivering commensurate real‑world gains.
  • Long tangent on education quality, literacy, and teacher pay: some argue we should invest in human education rather than AI; others say schooling is failing regardless of funding.
  • Developers are split between anxiety about job loss (especially for routine/CRUD work) and optimism that their individual leverage and the market for custom software will expand.