Announcing the data.gov archive

Importance of archiving public data

  • Many see the archive as a necessary response to large-scale removal of taxpayer-funded scientific data (CDC, climate, etc.), likened by some to “book burning” and historical erasure.
  • Commenters stress that public datasets are a public good and that deleting them can have serious consequences for research, policy, and accountability.
  • Some highlight specific at‑risk collections (USGS, NOAA, DTIC, NASA TRS) and are starting independent mirrors “just in case.”

Threat model: government hostility and rule of law

  • A major theme is distrust of the current administration’s willingness to respect constitutional constraints; several argue that a “rule-of-law mindset” underestimates a “might-makes-right” actor.
  • Others push back, calling the fears overblown and noting that climate data was mirrored during the previous Trump term without Gestapo-style crackdowns.
  • There are broader worries about democratic backsliding (elections becoming “managed,” attacks on the FEC, restrictions on scientists’ speech, family separations, protest crackdowns).

Can Harvard be compelled to remove the archive?

  • Some speculate about Harvard’s legal protections (private university, large endowment, possible state-level immunity) and argue it could withstand loss of federal funds.
  • Others point out the many levers the federal government has: research grants, federal student aid, tax treatment, indirect pressure on “not really private” universities.
  • Commenters debate whether strong US free-speech protections still mean much if the government is already ignoring other norms, and disagree over how the US compares with European speech and privacy regimes.

Technical and organizational resilience

  • Strong interest in decentralizing storage: torrents, IPFS, Filecoin/DePIN, and geographically distributing copies (especially outside the US).
  • Practical issues: 16 TB is large for individuals, but commenters suggest partial seeding, coordination via torrent piece-availability, and universities/research centers as primary mirrors.
  • Potential attacks on torrents (sock-puppet over-seeding of specific pieces) are noted.
  • Harvard’s project is funded in part by Filecoin-related organizations, and the team has released open-source “data-vault” tooling plus simple S3-compatible download paths, which commenters see as an implicit invitation to mirror widely.
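
The partial-seeding idea above can be sketched roughly as follows. This is only an illustration, not anything the project or the commenters published: the function name, the availability map, and the piece size are all hypothetical. A volunteer with limited disk space keeps the pieces that currently have the fewest seeders, up to a byte budget.

```python
from typing import Dict, List


def pick_pieces_to_seed(
    availability: Dict[int, int],  # hypothetical: piece index -> observed seeder count
    piece_size: int,               # bytes per piece (assumed uniform here)
    budget_bytes: int,             # how much disk the volunteer can donate
) -> List[int]:
    """Return the piece indices to store, rarest first, within the byte budget."""
    max_pieces = budget_bytes // piece_size
    # Rank by (seeder count, piece index) so ties are broken deterministically.
    ranked = sorted(availability, key=lambda idx: (availability[idx], idx))
    return sorted(ranked[:max_pieces])


if __name__ == "__main__":
    MIB = 1024 * 1024
    observed = {0: 5, 1: 1, 2: 3, 3: 1, 4: 9}  # made-up counts for illustration
    print(pick_pieces_to_seed(observed, piece_size=4 * MIB, budget_bytes=12 * MIB))
```

A scheme like this trusts the observed seeder counts, which is presumably where the over-seeding attack noted above comes in: peers that falsely appear to seed a piece make it look safe, so honest volunteers may skip it. A real coordinator would likely need some independent check on those counts.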

“Digital militia” analogy

  • Some frame the community of archivists, nonprofits, and hobbyists as a kind of digital analogue to the Second Amendment ideal: civilians as a check on state overreach, armed not with guns but with storage and bandwidth.