If AI seems smarter, it's thanks to smarter human trainers
Focus of model improvements: compute, data, and techniques
- Several comments argue that increased compute is the main current driver of frontier model gains, enabling more experiments, larger runs, and advanced techniques (e.g., RL-style “reasoning traces,” synthetic fine‑tuning; a sketch of the latter follows this list).
- Others note that human feedback is itself a training technique, and architectures, data curation, and training methods all matter alongside compute.
- There’s debate over how R&D effort is best allocated among data, architecture, training, compute, and “other,” with a strong short‑term bias toward compute.
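To make “synthetic fine‑tuning” concrete, here is a minimal rejection‑sampling‑style sketch in Python: sample several reasoning traces per question and keep only those whose final answer matches known ground truth as fine‑tuning data. The `generate` function is a hypothetical stand‑in for a model call, and the trace format is invented; this is one plausible pattern, not any lab’s actual pipeline.

```python
import random

# Hypothetical stand-in for a model call; a real pipeline would query an
# actual model. It returns a reasoning trace ending in "Answer: <value>".
def generate(question: str) -> str:
    steps = f"Reasoning about: {question}"
    answer = random.choice(["4", "5"])  # placeholder for model variability
    return f"{steps}\nAnswer: {answer}"

def extract_answer(trace: str) -> str:
    return trace.rsplit("Answer:", 1)[-1].strip()

def collect_traces(dataset, samples_per_question: int = 8):
    """Keep only sampled traces whose final answer matches the known
    ground truth; the survivors become synthetic fine-tuning examples."""
    kept = []
    for question, gold in dataset:
        for _ in range(samples_per_question):
            trace = generate(question)
            if extract_answer(trace) == gold:
                kept.append({"prompt": question, "completion": trace})
    return kept

if __name__ == "__main__":
    print(collect_traces([("What is 2 + 2?", "4")])[:2])
```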
Human trainers, data quality, and synthetic data
- The thread pushes back on the idea that “smarter trainers” alone explain progress; better architectures, techniques, and filtering (e.g., pre‑filtering junk data; see the filtering sketch after this list) are emphasized.
- Some expect the “human‑labeled data is better” argument to weaken as synthetic data and better annotation tools improve.
- A detailed anecdote from an AI trainer describes creating reasoning‑heavy benchmark questions; many model “failures” turned out to be mundane (outdated information, tokenization issues), and the models were often better than the project organizers at spotting contradictions.
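As a rough illustration of what junk pre‑filtering can look like, here is a minimal Python sketch using cheap heuristics (length, alphabetic ratio, repetition) plus exact‑hash deduplication. The thresholds are arbitrary assumptions; production pipelines use far more sophisticated quality classifiers and fuzzy deduplication.

```python
import hashlib

def looks_like_junk(text: str) -> bool:
    """Cheap heuristics: too short, mostly non-alphabetic characters,
    or the same few words repeated over and over."""
    if len(text) < 80:
        return True
    alpha_ratio = sum(c.isalpha() or c.isspace() for c in text) / len(text)
    if alpha_ratio < 0.7:
        return True
    words = text.split()
    return len(set(words)) < max(1, len(words) // 10)

def dedupe_and_filter(docs):
    """Drop junk and exact duplicates (by content hash) before training."""
    seen, kept = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen or looks_like_junk(doc):
            continue
        seen.add(digest)
        kept.append(doc)
    return kept

if __name__ == "__main__":
    docs = [
        "spam " * 50,    # repetitive: dropped
        "A short line",  # too short: dropped
        "Data curation pipelines combine heuristics, classifiers, and "
        "deduplication to raise the quality of web-scale corpora.",
    ]
    print(len(dedupe_and_filter(docs)))  # 1
```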
Capabilities, limitations, and evaluation
- Many see current AI as augmenting and redistributing human expertise rather than replacing it.
- Several complain that people fixate on failures and “hallucinations,” while others insist that occasional correctness doesn’t mean a system truly “can do” a task, given unpredictable errors.
- IQ tests and puzzle‑style word problems are discussed as highly “learnable,” not reliable measures of deep intelligence.
- Benchmarks that demand nuanced, framework‑dependent answers are seen as inherently tricky to automate; the toy grading sketch after this list illustrates why.
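A toy Python sketch of why such grading resists automation: exact‑match scoring works for closed‑form answers but fails open‑ended ones, and the keyword “rubric” below is a crude illustrative proxy, not a real grading method; nuanced answers typically need a human or model‑based judge.

```python
def exact_match(prediction: str, reference: str) -> bool:
    """Fine for closed-form answers like "42"; useless for essays."""
    return prediction.strip().lower() == reference.strip().lower()

def rubric_score(prediction: str, required_points: list[str]) -> float:
    """Crude keyword proxy for a rubric: what fraction of the required
    points does the answer touch? Real nuanced grading is much harder."""
    text = prediction.lower()
    return sum(p.lower() in text for p in required_points) / len(required_points)

# A framework-dependent question has no single gold string to match:
answer = "Under a consequentialist framing, the tradeoff favors disclosure."
print(exact_match(answer, "disclosure"))  # False, unhelpfully
print(rubric_score(answer, ["consequentialist", "tradeoff", "disclosure"]))  # 1.0
```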
Economic and ecosystem trends
- Foundation models are seen as on a path to commoditization; differentiation will come from domain expertise and applications (e.g., dev tools, cybersecurity).
- Some worry that giant data centers and compute budgets will make meaningful startup competition physically impossible.
- Others see large opportunity in domain‑specific tools and synthetic data generation, especially where experts can script data generators (a minimal generator sketch follows this list).
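As an illustration of experts scripting data generators, here is a minimal Python sketch that emits unit‑conversion Q&A pairs as JSON lines. The domain and record format are invented for illustration; a real generator would encode genuine domain knowledge (e.g., security events or legal citations).

```python
import json
import random

# Illustrative conversion table an expert might encode by hand.
CONVERSIONS = {
    ("km", "m"): 1000.0,
    ("kg", "g"): 1000.0,
    ("h", "min"): 60.0,
}

def make_example() -> dict:
    """Produce one synthetic prompt/completion pair from the table."""
    (src, dst), factor = random.choice(list(CONVERSIONS.items()))
    value = round(random.uniform(1, 50), 1)
    return {
        "prompt": f"Convert {value} {src} to {dst}.",
        "completion": f"{round(value * factor, 1)} {dst}",
    }

if __name__ == "__main__":
    # Emit a small synthetic training set as JSON lines.
    for _ in range(5):
        print(json.dumps(make_example()))
```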
Ethics, bias, and law
- Multiple comments stress that AI systems inherit human flaws: racism, misogyny, and structural limitations of law and institutions.
- Removing biased training data is viewed as necessary but insufficient; many domains have no clear, perfectly mechanizable ground truth.
User data, privacy, and opt‑out
- Some users resist using free tools (e.g., chatbots) for fear their contributions will be used to train models, an objection that sharpens when they are also paying customers.
- There’s debate about corporate promises not to train on user data: some see such assurances as reputationally binding, others as unverifiable and economically tempting to break.
- Policies differ by provider and product; what data is actually used for training remains somewhat unclear.
Human intelligence, expertise, and “hallucinations”
- Several comments argue that many humans, including degree‑holders, fail at basic critical thinking or trick questions, so AI mistakes shouldn’t be held to a higher bar than ordinary human reasoning.
- Others counter that degrees don’t equate to “smart,” and that much of human “intelligence” is accumulated cultural knowledge and search, not novel insight.
- This leads to discussion of “general intelligence” as socially defined and heavily dependent on prior learning and collaboration.
Prompting skill and user experience
- Frequent users report that prompt craftsmanship substantially affects output quality (e.g., linear structures, constrained outputs; see the sketch after this list).
- Prompting is getting both “smarter” and more tedious; there’s interest in tools that automate meta‑prompting.
- Some perceive models as getting “dumber” over time, possibly due to cost‑cutting, outdated training data, or contrast with initial novelty, though this is speculative and unresolved in the thread.
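As a small example of constraining outputs, here is a Python sketch that builds a prompt pinning down an exact JSON schema and validates the reply before accepting it. The schema and wording are assumptions for illustration, not any particular tool’s interface.

```python
import json

def build_prompt(task: str, fields: dict[str, str]) -> str:
    """Constrain the model: state the task, then pin down the exact
    output schema so the reply can be parsed mechanically."""
    schema = ", ".join(f'"{k}": <{v}>' for k, v in fields.items())
    return (
        f"Task: {task}\n"
        f"Respond with ONLY a JSON object of the form {{{schema}}}.\n"
        "No prose before or after the JSON."
    )

def parse_reply(reply: str, fields: dict[str, str]) -> dict | None:
    """Accept the reply only if it is valid JSON with exactly the
    requested keys; None signals that a retry is needed."""
    try:
        obj = json.loads(reply)
    except json.JSONDecodeError:
        return None
    return obj if set(obj) == set(fields) else None

fields = {"sentiment": "positive|negative|neutral", "confidence": "0.0-1.0"}
print(build_prompt("Summarize the review sentiment", fields))
print(parse_reply('{"sentiment": "positive", "confidence": 0.9}', fields))
```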