The AI Scientist: Towards Automated Open-Ended Scientific Discovery
Overall Reaction to “AI Scientist”
- Some see it as an impressive step toward automated scientific discovery and lower-level ML/software automation.
- Others view it as mostly marketing satire or a demo of how low the bar is for publishable ML papers.
- The self-modifying behavior (extending timeouts, recursively calling itself) is seen by some as mildly alarming, by others as just standard LLM code-editing behavior that needs sandboxing.
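The sandboxing point can be made concrete. What follows is a minimal sketch, not the AI Scientist's actual harness: it assumes the generated experiment is Python source and that the supervising process, not the generated code, owns the time limit. The function name `run_with_hard_timeout` and its return format are hypothetical, chosen only for illustration.

```python
import subprocess
import sys


def run_with_hard_timeout(code: str, timeout_s: float) -> dict:
    """Run Python source in a child process under a parent-enforced time limit.

    Hypothetical helper for illustration. The limit lives in the parent,
    so the child cannot extend it by editing its own script.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,  # child is killed when this expires
        )
        return {
            "timed_out": False,
            "returncode": proc.returncode,
            "stdout": proc.stdout,
        }
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "returncode": None, "stdout": ""}


if __name__ == "__main__":
    print(run_with_hard_timeout("print('ok')", timeout_s=10.0))
    print(run_with_hard_timeout("import time; time.sleep(30)", timeout_s=0.5))
```

A timeout alone is not a sandbox. It does not restrict file system or network access, and it does not stop the child from spawning further processes, so real isolation would also need a container or similar OS-level mechanism.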
Technical Capability and Paper Quality
- The system automates idea generation, coding, experiments, plots, and manuscript drafting, at a reported cost of roughly $15 per paper.
- Multiple commenters who read sample papers (e.g., low‑dimensional diffusion / dual‑scale diffusion) found them weak:
  - Low or dubious novelty; rehash of common “global/local” architectures.
  - Poorly motivated premises, shallow or mismatched citations, questionable experiments.
  - Hard to distinguish from a mediocre grad‑student paper, which is seen as a problem.
Validation, Review, and Spam Concerns
- The biggest bottleneck identified is validation and the human time it requires.
- The automated reviewer is criticized:
  - Evaluated only on human‑written papers, then extrapolated to AI‑written ones.
  - Higher false‑positive rate than humans, potentially flooding conferences with low‑quality work.
- Fears that journals and conferences will be swamped by AI papers, forcing use of AI reviewers and further degrading trust.
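To make the false-positive concern concrete, here is a small sketch of how such a rate could be computed. It assumes "positive" means "accept", so a false positive is a paper the reviewer accepts although the reference decision was reject. The function name and the sample decisions are hypothetical and are not taken from the paper or the thread.

```python
def false_positive_rate(predicted_accept, truly_accept):
    """Share of papers that should be rejected but were accepted."""
    if len(predicted_accept) != len(truly_accept):
        raise ValueError("decision lists must have the same length")
    # Papers whose reference decision is "reject".
    negatives = sum(1 for t in truly_accept if not t)
    if negatives == 0:
        raise ValueError("no reject decisions in the reference labels")
    # Accepted by the reviewer although the reference decision is "reject".
    false_positives = sum(
        1 for p, t in zip(predicted_accept, truly_accept) if p and not t
    )
    return false_positives / negatives


if __name__ == "__main__":
    # Made-up decisions for five papers, purely illustrative.
    reference = [True, False, False, False, False]
    reviewer = [True, True, False, False, True]
    print(false_positive_rate(reviewer, reference))  # 0.5
```

Because most submissions to a selective venue are rejections, even a modest increase in this rate translates into many additional weak papers being accepted, which is the flooding scenario commenters describe.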
Impact on Scientific Practice and Academia
- Supporters: AI can handle “boring science,” large search over models/experiments, and negative‑result exploration, freeing humans for deeper theory and creativity.
- Critics:
  - Science’s core value is trust and reproducibility, which automated code/data/analysis may undermine.
  - Automating creative parts could hollow out human expertise and training pipelines (PhD‑level learning-by-doing).
  - Risk of “theory‑free” science: lots of pattern‑fitting and papers, little understanding.
Data, Synthetic Content, and Model Collapse
- Questions raised about data sources, environments, and meaningful objectives for real discovery.
- Concern that AI‑generated research feeding future models could worsen quality (“model collapse”); others counter that curated synthetic data can still be useful.
Broader AI Risk and Ethics
- Debate over existential risk vs. current concrete harms.
- Some fear AI‑driven research could quickly enable dangerous technologies (e.g., bioengineering); others think current systems are still just “text generators” far from that.