Announcing the data.gov archive
Importance of archiving public data
- Many see the archive as a necessary response to large-scale removal of taxpayer-funded scientific data (CDC, climate, etc.), likened by some to “book burning” and historical erasure.
- Commenters stress that public datasets are a public good and that deleting them can have serious consequences for research, policy, and accountability.
- Some highlight specific at‑risk collections (USGS, NOAA, DTIC, NASA TRS) and are starting independent mirrors “just in case.”
Threat model: government hostility and rule of law
- A major theme is distrust of the current administration’s willingness to follow constitutional constraints; several argue that a “rule-of-law mindset” underestimates a “might-makes-right” actor.
- Others push back, calling fears overblown, noting that climate data was mirrored in the previous Trump term without Gestapo-style crackdowns.
- There are broader worries about democratic backsliding (elections becoming “managed,” attacks on the FEC, restrictions on scientists’ speech, family separations, protest crackdowns).
Can Harvard be compelled to remove the archive?
- Some speculate about Harvard’s legal protections (private university, large endowment, possible state-level immunity) and argue it could withstand loss of federal funds.
- Others point out the many levers the federal government has: research grants, federal student aid, tax treatment, indirect pressure on “not really private” universities.
- Commenters debate whether strong US free-speech protections meaningfully apply if the government is already ignoring other norms, and disagree over how this compares to European speech and privacy regimes.
Technical and organizational resilience
- Strong interest in decentralizing storage: torrents, IPFS, Filecoin/DePIN, and geographically distributing copies (especially outside the US).
- Practical issues: 16 TB is large for individuals, but commenters suggest partial seeding, coordination via torrent piece-availability, and universities/research centers as primary mirrors.
- Commenters also note potential attacks on torrents, such as sock-puppet peers over-seeding specific pieces.
- Harvard’s project is funded in part by Filecoin-related organizations. The team has released open-source “data-vault” tooling plus simple S3-compatible download paths, which commenters see as an implicit invitation to mirror widely.
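The partial-seeding idea raised in the thread (individuals who cannot hold all 16 TB each keep a slice, guided by piece availability) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `pick_pieces_to_seed`, the `availability` mapping of piece index to observed copy count, and the sample numbers are hypothetical and do not come from Harvard’s tooling or from any torrent client’s API. The sketch also takes reported availability at face value, which is exactly what the over-seeding concern above says can be spoofed.

```python
from typing import Dict, List


def pick_pieces_to_seed(
    availability: Dict[int, int],
    piece_size_bytes: int,
    budget_bytes: int,
) -> List[int]:
    """Choose the rarest pieces that fit in a volunteer's storage budget.

    availability maps piece index -> number of peers observed holding it
    (hypothetical input; a real client would gather this from the swarm).
    """
    if piece_size_bytes <= 0:
        raise ValueError("piece_size_bytes must be positive")

    # How many whole pieces fit in the budget (never negative).
    max_pieces = max(budget_bytes, 0) // piece_size_bytes

    # Rarest first; ties are broken by piece index so the result is stable.
    ranked = sorted(availability, key=lambda idx: (availability[idx], idx))
    return ranked[:max_pieces]


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Made-up swarm snapshot: piece 4 has no seeders, pieces 1 and 3 have one.
    snapshot = {0: 5, 1: 1, 2: 3, 3: 1, 4: 0}
    print(pick_pieces_to_seed(snapshot, 4 * MIB, 12 * MIB))  # [4, 1, 3]
```

Because the copy counts are self-reported by peers, a coordinated mirror effort would likely want to weight counts from known institutions more heavily or verify pieces independently; that part is left out of the sketch.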
“Digital militia” analogy
- Some frame the community of archivists, nonprofits, and hobbyists as a kind of digital analogue to the Second Amendment ideal: civilians as a check on state overreach, armed not with guns but with storage and bandwidth.