If AI seems smarter, it's thanks to smarter human trainers

Focus of model improvements: compute, data, and techniques

  • Several comments argue that increased compute is the main current driver of frontier model gains, enabling more experiments, larger runs, and advanced techniques (e.g., RL-style “reasoning traces,” synthetic fine‑tuning).
  • Others note that human feedback is itself a training technique, and architectures, data curation, and training methods all matter alongside compute.
  • There’s debate over how R&D effort is split among data, architecture, training methods, compute, and “other,” with a strong short‑term bias toward compute.

Human trainers, data quality, and synthetic data

  • The thread pushes back on the idea that “smarter trainers” alone explain progress; better architectures, techniques, and filtering (e.g., pre‑filtering junk data) are emphasized.
  • Some expect the “human‑labeled data is better” argument to weaken as synthetic data and better annotation tools improve.
  • A detailed anecdote from an AI trainer describes creating reasoning‑heavy benchmark questions; many “failures” of models were mundane (outdated info, tokenization issues), and models were often better than project organizers at spotting contradictions.

Capabilities, limitations, and evaluation

  • Many see current AI as augmenting and redistributing human expertise rather than replacing it.
  • Several complain that people fixate on failures and “hallucinations,” while others insist that occasional correctness doesn’t mean a system truly “can do” a task, given unpredictable errors.
  • IQ tests and puzzle‑style word problems are discussed as highly “learnable,” not reliable measures of deep intelligence.
  • Benchmarks that demand nuanced, framework‑dependent answers are seen as inherently tricky to automate.

Economic and ecosystem trends

  • Foundation models are seen as on a path to commoditization; differentiation will come from domain expertise and applications (e.g., dev tools, cybersecurity).
  • Some worry that giant data centers and compute budgets will make meaningful startup competition physically impossible.
  • Others see large opportunity in domain‑specific tools and synthetic data generation, especially where experts can script data generators.
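The “experts can script data generators” idea above can be sketched in a few lines. The domain (unit‑conversion word problems) and all function names below are illustrative assumptions, not something described in the thread:

```python
import random

def generate_example(rng: random.Random) -> dict:
    """Produce one synthetic QA pair whose label is correct by construction.

    A domain expert encodes the rule once (here, a trivial km -> m
    conversion); the script can then emit unlimited correctly labeled
    training examples without human annotation.
    """
    km = rng.randint(1, 500)
    question = f"How many meters are in {km} kilometers?"
    answer = str(km * 1000)  # ground truth comes from the generator itself
    return {"prompt": question, "completion": answer}

def generate_dataset(n: int, seed: int = 0) -> list[dict]:
    # Seeded RNG so the same dataset can be regenerated reproducibly.
    rng = random.Random(seed)
    return [generate_example(rng) for _ in range(n)]
```

The appeal, per the thread, is that expert effort goes into the generator script rather than into labeling individual examples.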

Ethics, bias, and law

  • Multiple comments stress that AI systems inherit human flaws: racism, misogyny, and structural limitations of law and institutions.
  • Removing biased training data is viewed as necessary but insufficient; many domains have no clear, perfectly mechanizable ground truth.

User data, privacy, and opt‑out

  • Some users resist free tools (e.g., chatbots) for fear their contributions will be used as training data; the concern is even sharper for paying customers whose data may be used anyway.
  • There’s debate about corporate promises not to train on user data: some see such assurances as reputationally binding, others as unverifiable and economically tempting to break.
  • Policies differ by provider and product; what data is actually used for training remains somewhat unclear.

Human intelligence, expertise, and “hallucinations”

  • Several comments argue that many humans, including degree‑holders, fail at basic critical thinking or trick questions, so AI mistakes shouldn’t be held to a higher bar than ordinary human reasoning.
  • Others counter that degrees don’t equate to “smart,” and that much of human “intelligence” is accumulated cultural knowledge and search, not novel insight.
  • This leads to discussion of “general intelligence” as socially defined and heavily dependent on prior learning and collaboration.

Prompting skill and user experience

  • Frequent users report that prompt craftsmanship substantially affects output quality (e.g., linear structures, constrained outputs).
  • Prompting is getting both “smarter” and more tedious; there’s interest in tools that automate meta‑prompting.
  • Some perceive models as getting “dumber” over time, possibly due to cost‑cutting, outdated training data, or contrast with initial novelty, though this is speculative and unresolved in the thread.
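The “constrained outputs” tactic mentioned above can be sketched as a prompt template plus a validator; the JSON convention and function names here are illustrative assumptions, not a technique spelled out in the thread:

```python
import json

def build_constrained_prompt(task: str, fields: list[str]) -> str:
    """Wrap a task in explicit output constraints.

    Demanding a fixed JSON shape makes model replies machine-parseable,
    so the caller can reject malformed answers instead of scraping prose.
    """
    schema = ", ".join(f'"{f}": ...' for f in fields)
    return (
        f"{task}\n\n"
        f"Respond with ONLY a JSON object of the form {{{schema}}}. "
        f"Include no extra text."
    )

def parse_reply(reply: str, fields: list[str]) -> dict:
    """Validate that a reply obeys the constraint; raise if it does not."""
    data = json.loads(reply)
    missing = [f for f in fields if f not in data]
    if missing:
        raise ValueError(f"reply missing fields: {missing}")
    return data
```

The validator is the point: rather than trusting the model, the caller checks the constrained format and can retry or re-prompt on failure, which is one way meta‑prompting tools automate the tedium the thread describes.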