Scientists should use AI as a tool, not an oracle
AI as Tool vs Oracle
- Strong agreement that AI (especially LLMs) should be treated as a tool, not an authority.
- Concern that many users, including some scientists, effectively let AI “write for them” or source facts without verification.
- Some suggest formal norms like citing AI as a writing assistant to increase transparency.
Hallucinations, References, and Reliability
- Multiple reports of fabricated or misrepresented citations; one user reported that roughly 95% of the references were fake.
- Complaints that LLMs distort even simple factual text when asked to “rewrite” in different tones.
- Clarification that hallucination is not a “bug” in the code path but an inherent consequence of the modeling approach.
Comparison to Search and Source Evaluation
- LLMs differ from search because they strip away context, provenance, and competing answers.
- Traditional search lets users judge credibility via site, author, and links; LLMs offer a single, confident narrative.
- Some think people overestimate their ability to detect unreliable web data anyway.
Use in Science, Academia, and Public Sector
- Worry that scientists will use chatbots to interpret results or draft papers, driven by publish‑or‑perish pressures.
- A contrasting view from public‑sector science: AI assistants could help triage huge backlogs (e.g., toxicology literature, QSAR trends) if used under expert oversight.
- Serious concern about public bureaucracies replacing human checks with AI for efficiency, leading to harmful decisions.
Expertise, Trust, and Epistemology
- Broader problem: people confidently argue against domain experts while uncritically trusting machines.
- Counterpoint: some “experts” in high‑profile domains are politicized, so laypeople struggle to know whom to trust.
- Several note that more information of lower average quality worsens existing epistemic problems.
Definitions: Leakage, Overfitting, and “Curve Fitting”
- “Leakage” is described as training on information that would not be available at inference time, often because data is mislabeled or improperly split; it is related to overfitting but distinct from it.
- Example: models learning background artifacts (e.g., trees) instead of the intended object.
- Some argue calling it “curve fitting” rather than “AI” would demystify it and clarify legal responsibility.
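To make the leakage definition above concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the synthetic data, the "subject" grouping, and the function names (`make_rows`, `fit_memorizer`, `row_level_split`, `subject_level_split`, `compare_splits`) do not come from the thread. Labels are pure noise, so no model can honestly beat chance. Each subject appears twice, and a row-level split lets copies of the same subject land on both sides of the train/test boundary.

```python
import random


def make_rows(n_subjects=2000, seed=0):
    """Synthetic rows: (subject_id, features, label). Labels are random noise."""
    rng = random.Random(seed)
    rows = []
    for subject_id in range(n_subjects):
        features = (rng.random(), rng.random())
        label = rng.randint(0, 1)  # no real signal to learn
        # Each subject contributes two identical rows (e.g. repeated measurements).
        rows.append((subject_id, features, label))
        rows.append((subject_id, features, label))
    return rows


def fit_memorizer(train):
    """A 'model' that memorizes exact feature vectors, else predicts the majority label."""
    table = {features: label for _, features, label in train}
    ones = sum(label for _, _, label in train)
    default = 1 if ones * 2 >= len(train) else 0
    return table, default


def accuracy(model, test):
    table, default = model
    hits = sum(table.get(features, default) == label for _, features, label in test)
    return hits / len(test)


def row_level_split(rows, seed=1):
    """Improper split: shuffles rows, so one subject can be in both train and test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def subject_level_split(rows, seed=1):
    """Proper split: every row of a subject goes to the same side."""
    ids = sorted({subject_id for subject_id, _, _ in rows})
    random.Random(seed).shuffle(ids)
    train_ids = set(ids[: len(ids) // 2])
    train = [row for row in rows if row[0] in train_ids]
    test = [row for row in rows if row[0] not in train_ids]
    return train, test


def compare_splits():
    rows = make_rows()
    leaky_train, leaky_test = row_level_split(rows)
    clean_train, clean_test = subject_level_split(rows)
    leaky_acc = accuracy(fit_memorizer(leaky_train), leaky_test)
    clean_acc = accuracy(fit_memorizer(clean_train), clean_test)
    return leaky_acc, clean_acc


if __name__ == "__main__":
    leaky_acc, clean_acc = compare_splits()
    print(f"row-level split (leaky):  {leaky_acc:.2f}")
    print(f"subject-level split:      {clean_acc:.2f}")
```

The row-level split reports accuracy well above chance even though the labels are noise, because the memorizer has already seen many test rows during training. Splitting by subject removes that shortcut and the score falls back to roughly chance. This also shows why leakage is distinct from overfitting: the evaluation itself is contaminated, not just the model.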
Intelligence, Correctness, and Anthropomorphism
- Debate over whether LLMs are “intelligent” or just probabilistic text generators.
- One side stresses they merely predict next tokens and lack concepts like truth or correctness internally.
- Others insist correctness is still a meaningful external criterion: if the output fails the user’s task, it is wrong, regardless of internal mechanics.
- Warnings that anthropomorphizing models and marketing them as oracles causes misuse and misplaced trust.
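The two positions above can be illustrated with a toy sketch. It is only an analogy: a bigram counter is far simpler than an LLM, and the corpus, the function names (`train_bigram`, `complete`, `is_correct`), and the reference answer are all invented for this example. The point is that the generator picks the most frequent continuation, with no internal notion of truth, while correctness is still checkable from the outside.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which token follows which; this is the entire 'model'."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(counts, prompt, max_new_tokens=1):
    """Greedily append the most frequent next token; truth is never consulted."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


def is_correct(answer, expected_last_word):
    """External criterion: does the output satisfy the user's task?"""
    return answer.split()[-1] == expected_last_word


# Invented toy corpus in which the wrong statement is simply more common.
CORPUS = ["the capital of australia is sydney"] * 3 + [
    "the capital of australia is canberra"
]

if __name__ == "__main__":
    model = train_bigram(CORPUS)
    answer = complete(model, "the capital of australia is")
    print(answer)  # ends with "sydney", the most frequent continuation
    print(is_correct(answer, "canberra"))  # False: wrong for the user's task
```

The generator is working exactly as designed when it produces the wrong answer, which is the sense in which such errors are not a "bug" in the code path. The external check does not care about the mechanism and still marks the output as wrong.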
Safety, Influence, and Corporate Incentives
- Speculation about using AI to subtly steer human behavior (e.g., toward “better” choices), with ethical and trust risks.
- View that corporations will integrate imperfect AI wherever it is economically beneficial, but must surround it with validation pipelines, as in software development.
- Some skepticism toward highly speculative AI‑doomer narratives; emphasis that genuine safety work should be grounded in real technical understanding.
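The validation-pipeline idea mentioned above can be sketched roughly as follows. This is a hypothetical illustration, not a description of any real system: the `Verdict` type, the `validate` function, the check names, and the `[ref-N]` citation format are all assumptions. Model output is treated as untrusted input, run through explicit checks, and anything that fails is held for human review instead of being accepted automatically.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Check = Callable[[str], bool]

# Hypothetical set of references a human has already verified.
KNOWN_REFERENCES = {"[ref-1]", "[ref-2]"}


@dataclass
class Verdict:
    accepted: bool
    failed_checks: List[str] = field(default_factory=list)


def is_not_empty(output: str) -> bool:
    return bool(output.strip())


def cites_only_known_refs(output: str) -> bool:
    """Every citation marker in the output must be in the verified set."""
    cited = re.findall(r"\[ref-\d+\]", output)
    return all(ref in KNOWN_REFERENCES for ref in cited)


CHECKS: Dict[str, Check] = {
    "is_not_empty": is_not_empty,
    "cites_only_known_refs": cites_only_known_refs,
}


def validate(output: str, checks: Dict[str, Check]) -> Verdict:
    """Accept the output only if every check passes; otherwise list the failures."""
    failed = [name for name, check in checks.items() if not check(output)]
    return Verdict(accepted=not failed, failed_checks=failed)


if __name__ == "__main__":
    draft = "The compound is discussed in [ref-1] and [ref-9]."
    verdict = validate(draft, CHECKS)
    if verdict.accepted:
        print("accepted")
    else:
        print("needs human review:", verdict.failed_checks)
```

The checks here are deliberately simple. In practice the hard part is deciding which checks are meaningful for a given task and who reviews the failures, which is the expert oversight the thread keeps returning to.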