Major AI conference flooded with peer reviews written by AI

Scale of AI-Generated Reviews

  • Many readers expected the share to be well above 21% and found the figure “shockingly low” given the incentives to offload tedious reviews.
  • Others stress that a 21% rate of fully AI‑generated reviews already implies widespread dereliction of duty in a process that’s supposed to be “peer” review.

Does AI Use Matter or Only Review Quality?

  • One camp: the tool used is irrelevant; what matters is whether reviews catch errors and provide useful feedback.
  • Opposing view: even if an LLM’s review is accurate, a conference that promises peer review cannot ethically substitute it for a human peer.
  • Several note common workflows where humans draft bullets and use LLMs to rewrite, translate, or polish; they argue these should not be equated with fraud.

AI Detectors and Pangram’s Claims

  • Strong skepticism toward AI detectors in general: earlier tools produced many false positives, especially on writing by non‑native English speakers, and were easily fooled.
  • Pangram’s cofounder claims a very low false positive rate and presents benchmarks; critics find “near-zero” error rates implausible and worry about data leakage and overfitting.
  • Some see the Nature piece as PR for Pangram and emphasize that detector statistics are not “proof” for individual cases.
  • Others counter that even imperfect detectors can be useful for aggregate statistics, so long as they are not used to punish individuals; the sketch below makes the distinction concrete.
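
A small worked example makes the two statistical points above concrete. All numbers here are illustrative assumptions, not Pangram’s published figures: it assumes a detector with a 95% true‑positive rate and a 1% false‑positive rate, applies the standard Rogan–Gladen correction to turn an observed flag rate into a prevalence estimate, and uses Bayes’ rule to show how weak a single flag is as evidence against an individual.

```python
# Illustrative sketch: why detector statistics can inform aggregate
# estimates while remaining weak evidence against any one reviewer.
# All rates below are assumptions for illustration, not Pangram's
# published benchmarks.

def flag_rate(prevalence: float, tpr: float, fpr: float) -> float:
    """Expected fraction of reviews a detector flags, given the true
    prevalence of AI-written reviews and the detector's error rates."""
    return prevalence * tpr + (1.0 - prevalence) * fpr

def precision(prevalence: float, tpr: float, fpr: float) -> float:
    """P(review is actually AI-written | detector flagged it), by Bayes' rule."""
    flagged = flag_rate(prevalence, tpr, fpr)
    return (prevalence * tpr) / flagged

def corrected_prevalence(observed_flag_rate: float, tpr: float, fpr: float) -> float:
    """Rogan-Gladen correction: recover true prevalence from an observed
    flag rate when the detector's error rates are known."""
    return (observed_flag_rate - fpr) / (tpr - fpr)

if __name__ == "__main__":
    tpr, fpr = 0.95, 0.01   # assumed detector quality (illustrative)
    observed = 0.21         # share of reviews flagged, as in the article

    # Aggregate inference: the estimate barely moves under a small FPR.
    p_hat = corrected_prevalence(observed, tpr, fpr)
    print(f"Estimated true prevalence: {p_hat:.1%}")  # ~21.3%

    # Individual inference: at a 5% base rate, even a 1% FPR means a
    # noticeable share of flags are false accusations.
    print(f"Precision at 5% base rate: {precision(0.05, tpr, fpr):.1%}")   # ~83%
    # And if the claimed FPR is optimistic in the wild (say 5%):
    print(f"Precision with 5% FPR:     {precision(0.05, tpr, 0.05):.1%}")  # 50%
```

The asymmetry is the crux of the thread’s argument: the aggregate estimate is nearly insensitive to a small false‑positive rate, while per‑review precision collapses if the real‑world rate is even a few points worse than the benchmark.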

Harms and Misuse of Detection

  • Educators report “knowing” that many student essays are AI‑assisted but lacking provable evidence; detectors push students to write in degraded, oversimplified styles to avoid being flagged.
  • Commenters warn that unreliable detectors create bias and witch-hunt dynamics: once content is flagged, humans start seeing “evidence” everywhere.

Broader Concerns About Peer Review and AI Slop

  • Many describe peer review as already overloaded and low-quality; AI simply lowers the effort further and expands the “market for lemons.”
  • Some fear AI’s bland, formulaic style is infecting human writing norms across the web and academia.
  • Others suggest remedies: more transparency about LLM use, reputation systems with consequences for abusive use, or even structuring conferences around AI‑generated baseline reviews that humans must correct, while acknowledging that these measures could also be gamed.