Reproducibility project fails to validate dozens of biomedical studies

Why Brazil and what this project did

  • Commenters note Brazil was chosen because the effort is a coalition of Brazilian labs focusing on domestic research, not as a judgment on that country in particular.
  • The Nature piece is framed as consistent with prior large-scale replication efforts, but distinct in focusing on a single country and specific methods.
  • Some highlight that many of the studies selected for replication had tiny sample sizes (median n ≈ 5), arguing this makes low replication rates statistically unsurprising and limits how much can be inferred; a quick simulation below makes the point concrete.
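
A back-of-the-envelope simulation (illustrative only, not the project's data; it assumes a two-group comparison, a genuinely large effect of Cohen's d = 0.8, and a two-sided t-test at α = 0.05) shows why n ≈ 5 per group makes "failed" replications expected even when the original finding is real:

```python
# Rough power simulation under assumed (not reported) conditions:
# two groups, a real effect of Cohen's d = 0.8, n = 5 per group,
# two-sided t-test at alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, effect_size, alpha, trials = 5, 0.8, 0.05, 20_000

significant = 0
for _ in range(trials):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(effect_size, 1.0, n_per_group)  # the effect is real
    _, p = stats.ttest_ind(treated, control)
    significant += p < alpha

power = significant / trials
print(f"Power at n=5 per group, d=0.8: ~{power:.2f}")
print(f"Chance an original AND an exact replication both hit p<.05: ~{power**2:.2f}")
```

Under these assumed numbers the simulated power is low (well under 50%), so even faithful replications of true effects frequently come up non-significant, which is the commenters' point about reading too much into the headline rate.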

What “failed replication” does and doesn’t mean

  • Multiple comments stress that non-replication ≠ fraud: it can reflect noise, unreported but crucial protocol details, differences in lab conditions, or stochastic phenomena.
  • Others push back that if studies don’t reproduce at their stated confidence levels, their value as evidence is undermined regardless of the cause, especially when downstream work and clinical practice rely on them.
  • There’s debate over how “consequential” failures are: is the non-replicable work obscure, or has it misdirected whole fields (e.g., Alzheimer’s/amyloid)?

Incentives, fraud, and systemic pressure

  • Many blame “publish or perish,” h-index chasing, and grant-driven metrics for encouraging corner-cutting, p‑hacking (illustrated in the sketch after this list), and selective reporting of results.
  • Some argue most problems are naive methodology and noisy systems rather than deliberate fraud; others claim outright fraud and misconduct are far more common and under-punished.
  • There’s disagreement over whether scientists are mostly intrinsically motivated (so incentives don’t dominate) or behave like any other profession under strong external pressures.
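
To make the p‑hacking point concrete, here is a toy simulation (my illustration, not an analysis from the thread): a lab measures ten unrelated outcomes with no real effect anywhere and reports whichever comparison happens to clear p < 0.05.

```python
# Toy illustration of one common p-hacking pattern: test many outcomes,
# report only the "best" p-value. Assumptions are illustrative only:
# 10 independent null outcomes, n = 20 per group, alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, n_outcomes, alpha, trials = 20, 10, 0.05, 10_000

false_positives = 0
for _ in range(trials):
    # No true effect anywhere: both groups drawn from the same distribution.
    control = rng.normal(0.0, 1.0, (n_outcomes, n_per_group))
    treated = rng.normal(0.0, 1.0, (n_outcomes, n_per_group))
    _, pvals = stats.ttest_ind(treated, control, axis=1)
    false_positives += pvals.min() < alpha  # report only the best-looking outcome

print(f"Chance of at least one 'significant' finding: ~{false_positives / trials:.2f}")
# Analytically this is about 1 - 0.95**10 ~= 0.40, roughly eight times the
# nominal 5% rate -- the inflation that pre-registration is meant to prevent.
```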

Proposed fixes

  • Pre-registration of hypotheses, analysis plans, and endpoints is widely endorsed to reduce p‑hacking and to ensure null results get published. Concerns: it can constrain exploratory work and make research “boring” or bureaucratic.
  • Dedicated funding and career tracks for replication work, possibly with independent teams and partial grant money reserved for follow-up replications.
  • Adjusting metrics: “reproducible” h-indices (a toy version appears after this list), recognition for null results, badging/verification of code and data, and automatic version control for analyses.
  • Structural ideas: making replication central to PhD or early-stage training, or creating formal/private “trust networks” where scientists rate labs and papers.
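
As one purely hypothetical reading of the “reproducible h-index” idea (the thread does not spell out a formula), the sketch below computes a standard h-index alongside a variant that only counts papers whose key result has been independently replicated:

```python
# Toy sketch: standard h-index vs. a hypothetical replication-gated variant.
# The gating rule is invented here for illustration; commenters proposed the
# concept without specifying a formula.
def h_index(citation_counts):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citation_counts, reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

# (citations, independently_replicated?) for a hypothetical author
papers = [(120, True), (45, False), (30, True), (12, True), (3, False)]

standard = h_index([c for c, _ in papers])
gated = h_index([c for c, replicated in papers if replicated])
print(f"standard h-index: {standard}, replication-gated h-index: {gated}")
```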

Broader impacts and other fields

  • Commenters connect this to declining public trust and “post-truth” dynamics, aggravated by media overhyping fragile results.
  • Several note parallel reproducibility issues in psychology, nutrition, and computer science/ML, especially where code, data, or hardware are unavailable.