MIT asks arXiv to withdraw preprint of paper on AI and scientific discovery
Apparent Problems with the Study
- Several commenters say the data “look fake”: plots are unusually clean, distributions look unnatural, and month‑by‑month breakdowns of scientists’ time seem implausibly precise given how noisy real‑world data is.
- The reported corporate experiment (an AI rollout to >1,000 materials scientists in mid‑2022) is viewed as logistically impossible or at least extremely unlikely: the rollout was too fast, the lab too large, and the technical description of the AI system too vague.
- Comparisons are drawn to prior high‑profile social science frauds where the claimed study design and vendor capabilities turned out to be impossible.
- Timeline issues are noted: claimed IRB approval and funding details appear inconsistent with when the student was actually at MIT.
- Some point to a later attempt to create a spoof corporate website/domain as further evidence of deception.
MIT’s Response and Confidentiality
- MIT’s statement says it has “no confidence” in the data or results and asks arXiv to mark the paper withdrawn, but gives no specifics.
- Some see this as necessary FERPA‑driven caution: student privacy law prevents releasing key evidence.
- Others see opacity and institutional self‑protection: “trust us, it’s bad” without showing the flaws is criticized as arrogant or anti‑scientific.
What to Do with the arXiv Preprint
- One camp: arXiv should not remove it; it’s an archival repository, not a quality arbiter. Better to leave it and let journals handle retractions.
- Another camp: the paper should be marked withdrawn/retracted but remain accessible, with an explicit notice, to preserve the record and help future readers interpret citations.
- There is some conflation of “removal”, “withdrawal”, and “retraction”; some clarify that an arXiv withdrawal keeps prior versions accessible, with a withdrawal notice attached.
Responsibility Beyond the Student
- Commenters question how a second‑year student’s single‑author paper with dramatic effect sizes got so much institutional and media endorsement without basic plausibility checks (size of the purported lab, realism of the gains).
- Some argue senior economists and advisers who publicly championed the work bear responsibility for not checking domain‑specific details.
- Others note that science is structurally vulnerable to determined fraudsters: peers and referees rarely have time or mandate to forensically audit data.
Broader Concerns: Fraud, Preprints, and Citations
- Several worry that the paper had already accumulated dozens of citations, likely from people who did not read it closely, illustrating how hype can propagate into the literature.
- Discussion highlights that peer review is weak at detecting deliberate fraud; preprints amplify the visibility of unvetted work, but journals also let “schlock” through.
- Some suggest that implausibility of the study design itself (“this could never have been run as described”) is an underused red‑flag heuristic.
Side Threads
- Debate over whether frequent use of “I” in a single‑author paper is odd but harmless or simply standard academic style for solo‑authored work.
- Long subthread on academic talk quality and filler words (“like”); several note that poor presentation skills are common even at elite institutions and are not, by themselves, evidence of fraud.