Why Most Published Research Findings Are False (2005)
Scope and intent of the paper
- Many commenters see the paper as a clear, accessible synthesis of long‑standing critiques of p‑values, low power, and biased study designs.
- Others argue it overgeneralizes from medical simulations to “all research,” noting that fields differ markedly in methods and error profiles.
- Some follow‑up work (cited in the thread) suggests the original quantitative claims (e.g., the proportion of findings that are false) were overstated but directionally correct.
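The paper's headline claim comes from a simple positive predictive value calculation: the probability that a statistically significant finding is true, given the pre-study odds and the error rates. A minimal sketch (the parameter values are illustrative, not the paper's exact scenarios; the formula omits the paper's bias term):

```python
def ppv(R, alpha=0.05, beta=0.8):
    """Positive predictive value: P(finding is true | it was claimed).

    R:     pre-study odds that a tested relationship is real.
    alpha: type I error rate (significance threshold).
    beta:  type II error rate (1 - power).
    """
    true_positives = (1 - beta) * R   # real effects that reach significance
    false_positives = alpha           # null effects that reach significance
    return true_positives / (true_positives + false_positives)

# Low prior odds plus low power push PPV below 0.5, i.e. most
# positive findings in such a field would be false.
print(ppv(R=0.1, alpha=0.05, beta=0.8))  # exploratory, low-powered field
print(ppv(R=1.0, alpha=0.05, beta=0.2))  # well-powered confirmatory study
```

With R=0.1 and 20% power the PPV is about 0.29, which is the shape of the argument the commenters are debating: the conclusion follows mechanically once you accept the assumed odds and power.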
Replication crisis and field differences
- Strong consensus that replication problems are worst in medicine, psychology, social sciences, ecology/climatology, and some life sciences.
- Physical sciences, some areas of chemistry/biochemistry, and engineering are seen as more robust, partly because experiments can be repeated many times with high signal‑to‑noise.
- Computer science is split: some see widespread non‑reproducibility (e.g., sensitivity to random seeds, missing methods/code), others stress that CS results are more abstract and not meant as drop‑in industrial solutions.
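The seed-sensitivity complaint about CS has a cheap partial remedy: rerun the experiment under several seeds and report the spread rather than a single number. A hypothetical sketch, where `run_experiment` stands in for any stochastic training or evaluation routine:

```python
import random
import statistics

def run_experiment(seed):
    # Stand-in for a stochastic experiment (e.g. model training).
    # Here: a noisy score whose outcome depends on the seed.
    rng = random.Random(seed)
    return 0.80 + rng.gauss(0, 0.02)

# Report mean and spread over seeds instead of one cherry-picked run.
scores = [run_experiment(seed) for seed in range(10)]
print(f"mean={statistics.mean(scores):.3f} "
      f"stdev={statistics.stdev(scores):.3f}")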
Peer review, media, and public trust
- Peer review is described as a weak filter: more a “stamp” than verification or replication.
- Science journalism is heavily criticized for hyping single studies, omitting key details, and fostering the impression that each paper is a fact.
- This feeds public confusion (“science changes every week”) and politicized sloganizing (“trust the science”) despite fragile underlying evidence.
- Several argue decisions should rest on replicated results, meta‑analyses, and convergence of evidence, not single papers.
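The "rely on converging evidence" point has a standard quantitative form: inverse-variance pooling of effect estimates across studies, the core of a fixed-effect meta-analysis. A minimal sketch with made-up illustrative numbers:

```python
def pooled_effect(effects, std_errors):
    """Fixed-effect meta-analysis: inverse-variance weighted mean."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5  # SE of the pooled estimate
    return pooled, pooled_se

# Three hypothetical studies; the large, precise one dominates the pool,
# which is why one small flashy result moves the needle little.
effects = [0.40, 0.10, 0.15]
ses = [0.20, 0.05, 0.10]
est, se = pooled_effect(effects, ses)
print(f"pooled effect = {est:.3f} +/- {se:.3f}")
```

The pooled standard error is smaller than any single study's, which is the formal version of "decisions should rest on convergence, not single papers."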
Non‑replication, “truth,” and usefulness
- Some emphasize that non‑replication often signals “failure to generalize” rather than outright falsity, especially in heterogeneous human/medical contexts.
- Others counter that, given most hypotheses are false a priori, repeated non‑replication should strongly increase confidence in the null.
- Key point: even nominally true but irreproducible results are a poor foundation for further work; science needs stable, usable building blocks.
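The "repeated non-replication should favor the null" argument is a straightforward Bayesian update. A sketch with assumed numbers (the prior and the per-study replication probabilities under each hypothesis are illustrative, not from the thread):

```python
def update(prior_true, replicated, p_rep_if_true=0.8, p_rep_if_false=0.05):
    """One Bayesian update on the outcome of a replication attempt."""
    if replicated:
        like_true, like_false = p_rep_if_true, p_rep_if_false
    else:
        like_true, like_false = 1 - p_rep_if_true, 1 - p_rep_if_false
    num = like_true * prior_true
    return num / (num + like_false * (1 - prior_true))

# Start at 50:50 on the effect being real, then observe three
# failed replications in a row.
p = 0.5
for _ in range(3):
    p = update(p, replicated=False)
print(f"P(effect is real) after 3 failures: {p:.4f}")
```

Each failure multiplies the odds by a likelihood ratio well below 1, so a handful of non-replications drives the posterior near zero; the "failure to generalize" objection amounts to disputing the likelihoods, not the arithmetic.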
Incentives, misconduct, and systemic issues
- “Publish or perish,” prestige chasing, and funding rules incentivize p‑hacking, overinterpretation, and omission of crucial methods.
- Fraud is viewed as rare but impactful; incompetence and sloppy practice are seen as far more common.
- Commenters note paper mills, retractions, and national/institutional pressures, but disagree on how pervasive outright fraud is.
Proposed reforms and alternatives
- Suggestions include: valuing replication, preregistration, better data management and sharing, publishing under pseudonyms, open code, meta‑science tools, and new publishing platforms.
- Some argue science is a noisy but convergent process (like gradient descent) that moves toward truth over generations; others worry it can get trapped in local minima due to politics, funding, and entrenched assumptions.