Internet Archive: Security breach alert
Incident overview
- Visitors to archive.org saw a JavaScript alert() popup claiming a “catastrophic security breach” and that “31 million” users were now on Have I Been Pwned (HIBP).
- Users quickly confirmed seeing the popup; the site then began returning 503/504 errors with periods of full downtime, and later a “Temporarily Offline” page appeared.
- Separate reports describe both a DDoS attack and a data breach affecting about 31M accounts, with the leaked data added to HIBP.
- Thread participants note that media coverage initially blurred “DDoS” vs “breach,” and that final details are still emerging.
Attack vectors and technical details
- Multiple commenters trace the popup to malicious JavaScript served from polyfill.archive.org, a self‑hosted instance of the polyfill service previously involved in a supply‑chain incident.
- This explains the injected window.alert but not necessarily how database access was gained; whether the two share the same vector is unclear.
- An online group has claimed credit for a DDoS campaign against IA; their stated motives (pro‑Palestinian, anti‑US) are widely questioned, with some suggesting they are opportunistic script‑kiddies or a possible false flag.
- There is concern about linking to a currently compromised site because it could deliver malware.
Data breach impact and security posture
- HIBP domain and email checks confirm many IA users’ addresses are in the leak; BleepingComputer is cited for ~31M records.
- Leaked fields reportedly include emails, usernames, and bcrypt password hashes; no confirmation that payment data was taken, but commenters stress “we don’t know that’s all.”
- Several people find old or changed emails still present in the dump, suggesting historical data was retained or the breach window is earlier than stated.
- Commenters repeatedly advise: password managers, unique passwords per site, 2FA/MFA “for anything of value,” and ideally unique or aliased email addresses per service.
- There is debate over storing 2FA seeds inside password managers: convenient and better than no 2FA, but it collapses two factors into one vault.
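To make the 2FA-seed debate concrete, here is a minimal sketch of how a time-based one-time code is derived, following the published HOTP/TOTP algorithms (RFC 4226 and RFC 6238) with only the Python standard library. It is not taken from the thread, and the names hotp, totp, and seed are illustrative. The point it shows: the code depends only on the seed and the clock, so whoever can read the seed can produce valid codes.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(seed: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def totp(seed: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP where the counter is the current 30-second window."""
    now = time.time() if at is None else at
    return hotp(seed, int(now // step), digits)


if __name__ == "__main__":
    # Test seed from the RFCs; real seeds are random and usually shared base32-encoded.
    demo_seed = b"12345678901234567890"
    print(totp(demo_seed, at=59))
```

If the seed sits in the same vault as the password, a single vault compromise yields both inputs needed to log in, which is the "two factors into one" concern; the counterargument in the thread is that this is still stronger than having no second factor.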
IA design, privacy, and accounts
- Commenters highlight that uploader email addresses are already exposed in item metadata and account XML, viewing this as a longstanding privacy flaw IA has not fixed.
- This sparks discussion about whether email addresses should be treated as private data, and how much linkage between identity and contributions an archive ought to expose.
- Some ask why IA needs user accounts at all; others point out they’re required for uploads and for borrowing digitized books.
Motives, ethics, and community reaction
- Many express anger that a public‑good, donation‑funded “library of the internet” is being attacked at all, likening it to vandalizing a public library.
- Others speculate about enemies of IA (publishers, states) but also warn against ungrounded conspiracies and over‑attribution.
- There’s debate over “hack value”: curiosity‑driven exploits vs destructive DDoS and mass data leaks; several invoke hacker ethics that distinguish making public data accessible from exposing private data.
- One user notes being actively doxxed via archived social media and feeling conflicted: valuing IA while suffering from the permanence of their archived PII.
Resilience, decentralization, and backups
- The outage renews worries about IA as a global single point of failure, with frequent “Library of Alexandria” metaphors and calls for multiple independent archives.
- Participants discuss prior/ongoing attempts to mirror or decentralize IA’s corpus (IPFS/Filecoin, torrents, Freenet/Hyphanet), noting scale and filesystem/UX challenges.
- Rough napkin math for mirroring tens of petabytes shows hardware cost is attainable for large organizations but daunting for volunteers; tape and better compression are suggested for cold backups.
- Several propose volunteer‑driven distributed backup schemes using personal storage plus smart redundancy and metadata tracking, though reliability, copyright, and funding remain open problems.
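The napkin math and the volunteer-backup idea above can be sketched as a small calculation. Every number below is an illustrative placeholder, not a figure from the thread: a 50 PB corpus stands in for "tens of petabytes", and the drive size, price per TB, per-volunteer contribution, and copy count are assumptions to swap for real quotes. The estimate covers raw drives only and leaves out servers, power, bandwidth, and staff.

```python
import math


def mirror_estimate(corpus_pb: float, drive_tb: float,
                    usd_per_tb: float, copies: int = 1) -> tuple[int, float]:
    """Return (drives needed, raw drive cost in USD) for `copies` full mirrors."""
    total_tb = corpus_pb * 1000 * copies      # 1 PB = 1000 TB (decimal units)
    drives = math.ceil(total_tb / drive_tb)   # whole drives only
    return drives, drives * drive_tb * usd_per_tb


def volunteers_needed(corpus_pb: float, tb_per_volunteer: float,
                      copies: int) -> int:
    """Volunteers required if each donates `tb_per_volunteer` and data is kept `copies` times."""
    return math.ceil(corpus_pb * 1000 * copies / tb_per_volunteer)


if __name__ == "__main__":
    # Placeholder inputs: 50 PB corpus, 20 TB drives at $15/TB, one copy.
    drives, cost = mirror_estimate(corpus_pb=50, drive_tb=20, usd_per_tb=15)
    print(f"{drives} drives, about ${cost:,.0f} in raw disks")
    # Placeholder inputs: volunteers donate 1 TB each, every item stored 3 times.
    print(volunteers_needed(corpus_pb=50, tb_per_volunteer=1, copies=3), "volunteers")
```

Under these placeholder inputs, the disk bill is within reach of a large organization, while the volunteer route needs a very large and reliable pool, which matches the thread's point about scale and redundancy being the hard part.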