2% of ICML papers desk rejected because the authors used LLM in their reviews

Detection method & scope

  • Many praise the hidden watermark / prompt-injection scheme in PDFs as clever, precise, and far more reliable than generic “AI detectors.”
  • It only flags reviewers who fed the full PDF to an LLM and pasted output verbatim, not those who used LLMs for light editing or idea support.
  • Several point out that the ~2% headline rate is very conservative; actual LLM use in both “no-LLM” (Policy A) and “LLM-allowed” (Policy B) groups is likely much higher.

Ethics, dishonesty, and dependency

  • Strong consensus that the core issue is not LLM use per se but breaking an explicitly chosen “no-LLM” commitment.
  • Some frame this as straightforward cheating and lying; others emphasize human weakness and “impulse control,” likening LLM reliance to addiction.
  • A few describe personal strategies (separate machines, blocking paste) to avoid LLM contamination of professional writing.

Debate over sanctions

  • Opinions range from “ban for life as a deterrent” to “this should be a learning moment, especially for students.”
  • Several argue research on deterrence suggests certainty of enforcement matters more than harshness of punishment.
  • Others stress punishment also signals community norms and rewards honest reviewers.
  • Clarified that reciprocal reviewers who violated Policy A had their own submissions desk-rejected; innocent authors are not targeted.

LLMs in reviewing: tool vs abuse

  • Some reviewers say they would (or do) use LLMs legitimately: summarizing, flagging issues, improving tone, or checking fairness.
  • Others insist that if you need an LLM to understand a paper, you shouldn’t review it.
  • There is skepticism that anti-LLM policies are sustainable amid rising workloads and paper volume.

Prompt injection & security concerns

  • Several note the irony that enforcement relies on the same prompt-injection vulnerability considered dangerous elsewhere.
  • The lack of separation between “data” and “instructions” in LLM inputs is highlighted as a fundamental security and reliability problem.
  • Commenters worry that authors can embed positive-review instructions in papers themselves, manipulating LLM-assisted reviewers.

Academic incentives & political economy

  • Multiple comments describe ML academia as hyper-competitive, low-trust, and overloaded, with reciprocal reviewing adding unpaid labor.
  • Some see LLM misuse as a predictable outcome of exploitative structures; others respond that reviewing is core professional service and conferences are not-for-profit.