2025-12-20

Backing up Spotify

Legality and Terms of Service

Many commenters state that distributing the audio files is clearly illegal copyright infringement; some distinguish “piracy” from “theft” but still call it unlawful.
Commenters debate whether scraping violates the law or only Spotify’s ToS. One lawyer notes that ToS are a matter of contract law and usually require actual assent, and that criminal and civil liability are separate questions.
Metadata-only release is seen as much safer; text metadata itself might be legal, though large‑scale scraping could still breach contracts and anti‑hacking statutes depending on jurisdiction.
Some argue research use might fall under fair use (especially in the US), but that doesn’t protect redistribution.
Jurisdiction matters: Anna’s Archive is believed to operate from Russia or similar jurisdictions, complicating enforcement but not preventing DNS/IP blocking, which is already happening in several EU countries.

Ethics, Artists, and “Stealing”

One side: ripping and releasing Spotify’s catalog is framed as “stealing” from thousands of artists, many already poorly paid; enabling others to resell or stream without licenses is seen as clearly harmful.
Counter‑side: copying doesn’t deprive the rights holder of their copy; the main harm is hypothetical lost sales to people who might never have paid anyway. Many argue streaming pays artists “peanuts” and labels capture most revenue.
Several musicians say streaming income is negligible; real money comes from touring, merch, direct sales (Bandcamp, CDs, vinyl). For them, large‑scale piracy is more about exposure than lost income.
Preservationists stress that streaming platforms routinely remove works (rights changes, regional exits, politics), creating “contemporary lost media.” They view this archive as cultural insurance for future generations.
Others are uneasy: they support preservation but fear this scale and visibility will draw aggressive music‑industry litigation and jeopardize Anna’s Archive’s book collections.

Spotify Critique

Frequent reminders that Spotify itself reportedly bootstrapped with unlicensed catalogs; some see current outrage as hypocritical.
Complaints about tiny per‑stream payouts, new minimum‑stream thresholds, label capture of revenue, and Spotify’s push of low‑royalty “garbage”/AI content and house-brand tracks.
Users report songs and even whole catalogs disappearing, greyed‑out tracks, region restrictions, worsening recommendations, and general “enshittification.”
Others defend Spotify as reasonably priced and convenient given storage, bandwidth, and catalog breadth.

Technical Aspects of the Rip

Discussion of how 300 TB could be exfiltrated: many parallel accounts, continuous streaming/download at 160 kbps (free tier), or direct use of open‑source clients (librespot) and possibly DRM cracks (Widevine, “playplay”).
Some speculate about insider access or leaked credentials but nothing concrete is known.
Questions about Spotify’s rate‑limiting and why it didn’t prevent this; some suggest that such traffic may look like heavy but plausible listening.
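
The scale being discussed can be put in rough numbers. The sketch below is a back-of-envelope estimate only: it assumes the whole 300 TB was fetched at the 160 kbps free-tier bitrate in strict real time, uses decimal terabytes, and ignores protocol overhead. The account counts and the function name `days_to_stream` are hypothetical, and a real client may well download faster than playback speed.

```python
# Back-of-envelope estimate: how long would 300 TB take at 160 kbps?
# Assumptions (not from the thread): decimal units (1 TB = 10**12 bytes),
# strictly real-time streaming, no protocol overhead.

TOTAL_BYTES = 300 * 10**12   # 300 TB, the figure quoted in the discussion
BITRATE_BPS = 160_000        # 160 kbps free-tier stream
SECONDS_PER_DAY = 86_400


def days_to_stream(total_bytes: int, bitrate_bps: int, accounts: int) -> float:
    """Days needed if `accounts` sessions stream in parallel, non-stop."""
    total_seconds = total_bytes * 8 / bitrate_bps
    return total_seconds / accounts / SECONDS_PER_DAY


if __name__ == "__main__":
    for accounts in (1, 100, 1_000, 10_000):  # hypothetical account counts
        days = days_to_stream(TOTAL_BYTES, BITRATE_BPS, accounts)
        print(f"{accounts:>6} accounts: {days:,.0f} days")
```

Under these assumptions a single session would need roughly 475 years, while 1,000 parallel sessions would need about 174 days, which is why commenters gravitate toward explanations involving many accounts or faster-than-real-time downloads.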
Torrent distribution: users note BitTorrent supports selective downloading; a “Popcorn Time for music” UI that streams directly from these torrents is considered technically straightforward, if blatantly illegal.
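
Selective downloading works because a classic multi-file torrent treats its files as one concatenated byte stream cut into fixed-size pieces, so a client only needs the pieces that overlap the chosen files. The sketch below illustrates that mapping in plain Python; it is not tied to any particular client, and the function name `pieces_for_file`, the file sizes, and the piece length are made up for illustration.

```python
# Sketch: which pieces of a multi-file torrent cover one chosen file?
# Files are laid out back to back and the resulting byte stream is split
# into fixed-size pieces. All sizes below are hypothetical.

from itertools import accumulate


def pieces_for_file(file_lengths: list[int], piece_length: int, index: int) -> range:
    """Return the piece indices that overlap file number `index`."""
    if file_lengths[index] == 0:
        return range(0)                        # an empty file needs no pieces
    offsets = [0, *accumulate(file_lengths)]   # byte offset where each file starts
    start, end = offsets[index], offsets[index + 1]  # file spans [start, end)
    first = start // piece_length
    last = (end - 1) // piece_length
    return range(first, last + 1)


if __name__ == "__main__":
    lengths = [5_000_000, 3_200_000, 4_100_000]  # hypothetical track sizes in bytes
    piece_length = 1_048_576                     # 1 MiB pieces (assumed)
    print(list(pieces_for_file(lengths, piece_length, 1)))
```

With these example sizes the second file maps to pieces 4 through 7. Piece 4 also holds the tail of the first file, so selecting one track usually pulls in a little data from its neighbours.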

Value of Metadata and Research Uses

The metadata dump (hundreds of millions of tracks, ISRCs, genres, keys, tempos, popularity scores) is widely seen as a goldmine for:
- Music information retrieval, recommendation, classification, and search benchmarking.
- Studying long‑tail listening behavior: a large majority of tracks have under 1,000 streams.
- Genre and key distributions (e.g., unexpected prevalence of Db/C#; large counts for opera and psytrance raise questions about classification quality or auto‑generated content).
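
As a hedged illustration of the kind of analysis listed above, the sketch below computes the share of tracks under a stream threshold and a key histogram that merges enharmonic spellings such as Db and C#. The records and field names (`streams`, `key`) are invented for the example and are not the dump's actual schema.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative only.
tracks = [
    {"title": "A", "streams": 12, "key": "C#"},
    {"title": "B", "streams": 950, "key": "Db"},
    {"title": "C", "streams": 48_000, "key": "G"},
    {"title": "D", "streams": 3, "key": "Db"},
    {"title": "E", "streams": 2_500_000, "key": "A"},
]

# Map flat spellings onto their sharp equivalents so Db and C# count together.
ENHARMONIC = {"Db": "C#", "Eb": "D#", "Gb": "F#", "Ab": "G#", "Bb": "A#"}


def long_tail_share(records, threshold=1_000):
    """Fraction of tracks with fewer than `threshold` streams."""
    if not records:
        return 0.0
    below = sum(1 for r in records if r["streams"] < threshold)
    return below / len(records)


def key_counts(records):
    """Count tracks per key after normalizing enharmonic spellings."""
    return Counter(ENHARMONIC.get(r["key"], r["key"]) for r in records)


if __name__ == "__main__":
    print(f"share under 1,000 streams: {long_tail_share(tracks):.0%}")
    print(key_counts(tracks).most_common())
```

Normalizing spellings before counting matters for claims like the Db/C# prevalence, since the same pitch class could otherwise be split across two labels.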
Some want this ingested into projects like MusicBrainz/EveryNoise or wrapped in an open API; others mention building search front‑ends and using it as IR benchmark data.
There’s interest in using the metadata to detect AI‑generated “slop” and mislabeled content.

AI Training and “Slop” Concerns

Many expect big tech and AI labs to be early heavy users; Anna’s Archive already advertises paid bulk access for AI training.
Critics see this as fueling even more low‑effort generative music and undermining already precarious human creators.
Supporters reply that AI companies already scrape or license massive catalogs; this archive marginally changes access but greatly helps independent researchers.

User Behavior, Alternatives, and Blocking

Several note average listeners are unlikely to handle 300 TB torrents; piracy’s real draw is cheap, polished interfaces, not raw files.
Others describe existing consumer‑friendly piracy boxes for video as precedent, and foresee similar tools for this music set.
Many advocate supporting artists directly (Bandcamp, shows, merch) while self‑hosting personal libraries (Jellyfin/Navidrome/Lidarr) and using tools to back up Spotify playlists.
Access to Anna’s Archive is already DNS‑blocked in parts of Germany, Belgium, the Netherlands and elsewhere; users bypass via VPNs or custom DNS, and criticize private “copyright clearing” bodies driving such blocks.
