MIT asks arXiv to withdraw preprint of paper on AI and scientific discovery

Apparent Problems with the Study

  • Several commenters say the data “look fake”: plots are unusually clean, distributions look unnatural, and month‑by‑month breakdowns of scientists’ time seem implausible given the noise of real-world data.
  • The reported corporate experiment (AI rollout to >1,000 materials scientists in mid‑2022) is viewed as logistically impossible or extremely unlikely: too fast a rollout, too large a lab, and too vague a technical description of the AI system.
  • Comparisons are drawn to prior high‑profile social science frauds where the claimed study design and vendor capabilities turned out to be impossible.
  • Timeline issues are noted: claimed IRB approval and funding details appear inconsistent with when the student was actually at MIT.
  • Some point to a later attempt to create a spoof corporate website/domain as further evidence of deception.

MIT’s Response and Confidentiality

  • MIT’s statement says it has “no confidence” in the data or results and asks arXiv to mark the paper withdrawn, but gives no specifics.
  • Some see this as necessary FERPA‑driven caution: student privacy law prevents releasing key evidence.
  • Others see opacity and institutional self‑protection: “trust us, it’s bad” without showing the flaws is criticized as arrogant or anti‑scientific.

What to Do with the arXiv Preprint

  • One camp: arXiv should not remove it; it’s an archival repository, not a quality arbiter. Better to leave it and let journals handle retractions.
  • Another camp: the paper should be marked withdrawn/retracted but remain accessible, with an explicit notice, to preserve the record and help future readers interpreting citations.
  • There is confusion between “removal”, “withdrawal”, and “retraction”; some clarify that arXiv withdrawal keeps prior versions accessible with a withdrawal notice.

Responsibility Beyond the Student

  • Commenters question how a second‑year student’s single‑author paper with dramatic effect sizes got so much institutional and media endorsement without basic plausibility checks (size of the purported lab, realism of the gains).
  • Some argue senior economists and advisers who publicly championed the work bear responsibility for not checking domain‑specific details.
  • Others note that science is structurally vulnerable to determined fraudsters: peers and referees rarely have time or mandate to forensically audit data.

Broader Concerns: Fraud, Preprints, and Citations

  • Several worry that the paper had already accumulated dozens of citations, likely from people who did not read it closely, illustrating how hype can propagate into the literature.
  • Discussion highlights that peer review is weak at detecting deliberate fraud; preprints amplify the visibility of unvetted work, but journals also let “schlock” through.
  • Some suggest impossible or implausible study designs (“this could never have been run as described”) are an underused red-flag heuristic.

Side Threads

  • Debate over whether frequent use of “I” in a single‑author paper is odd but harmless, or simply reflects broader academic style conventions.
  • Long subthread on academic talk quality and filler words (“like”); several note that poor presentation skills are common even at elite institutions and are not, by themselves, evidence of fraud.