Backing up Spotify

Legality and Terms of Service

  • Many commenters state the project is clearly illegal copyright infringement for the audio files; some distinguish “piracy” from “theft” but still call it unlawful.
  • Debate over whether scraping violates law or only Spotify’s ToS. One lawyer notes that ToS fall under contract law and usually require actual assent, and that criminal and civil liability are separate questions.
  • Metadata-only release is seen as much safer; text metadata itself might be legal, though large‑scale scraping could still breach contracts and anti‑hacking statutes depending on jurisdiction.
  • Some argue research use might fall under fair use (especially in the US), but that doesn’t protect redistribution.
  • Jurisdiction matters: Anna’s Archive is believed to operate from Russia or similar jurisdictions, complicating enforcement but not preventing DNS/IP blocking, which is already happening in several EU countries.

Ethics, Artists, and “Stealing”

  • One side: ripping and releasing Spotify’s catalog is framed as “stealing” from thousands of artists, many already poorly paid; enabling others to resell or stream without licenses is seen as clearly harmful.
  • Counter‑side: copying doesn’t deprive the rights holder of their copy; the main harm is hypothetical lost sales from people who might never have paid anyway. Many argue streaming pays artists “peanuts” and labels capture most revenue.
  • Several musicians say streaming income is negligible; real money comes from touring, merch, direct sales (Bandcamp, CDs, vinyl). For them, large‑scale piracy is more about exposure than lost income.
  • Preservationists stress that streaming platforms routinely remove works (rights changes, regional exits, politics), creating “contemporary lost media.” They view this archive as cultural insurance for future generations.
  • Others are uneasy: they support preservation but fear this scale and visibility will draw aggressive music‑industry litigation and jeopardize Anna’s Archive’s book collections.

Spotify Critique

  • Frequent reminders that Spotify itself reportedly bootstrapped with unlicensed catalogs; some see current outrage as hypocritical.
  • Complaints about tiny per‑stream payouts, new minimum‑stream thresholds, label capture of revenue, and Spotify’s push of low‑royalty “garbage”/AI content and house-brand tracks.
  • Users report songs and even whole catalogs disappearing, greyed‑out tracks, region restrictions, worsening recommendations, and general “enshittification.”
  • Others defend Spotify as reasonably priced and convenient given storage, bandwidth, and catalog breadth.

Technical Aspects of the Rip

  • Discussion of how 300 TB could be exfiltrated: many parallel accounts, continuous streaming/download at 160 kbps (free tier), or direct use of open‑source clients (librespot) and possibly DRM cracks (Widevine, “playplay”).
  • Some speculate about insider access or leaked credentials but nothing concrete is known.
  • Questions about Spotify’s rate‑limiting and why it didn’t prevent this; one suggestion is that such traffic may look like heavy but plausible listening.
  • Torrent distribution: users note BitTorrent supports selective downloading; a “Popcorn Time for music” UI that streams directly from these torrents is considered technically straightforward, if blatantly illegal.
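
The "many parallel accounts" point can be checked with rough arithmetic. The sketch below is only a back-of-the-envelope estimate under loud assumptions: 300 TB is treated as 300e12 bytes, every byte is assumed to arrive at exactly 160 kbps in real time, and the account counts are made up for illustration. Nothing here reflects how the rip was actually done, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def stream_seconds(total_bytes: float, bitrate_bps: float) -> float:
    """Seconds needed to move total_bytes over one stream at bitrate_bps."""
    return total_bytes * 8 / bitrate_bps


def days_with_accounts(total_bytes: float, bitrate_bps: float, accounts: int) -> float:
    """Days needed if the work is split evenly across parallel accounts."""
    return stream_seconds(total_bytes, bitrate_bps) / accounts / SECONDS_PER_DAY


TOTAL_BYTES = 300e12   # assumption: 300 TB in decimal units
BITRATE_BPS = 160_000  # 160 kbps, the free-tier figure cited in the thread

if __name__ == "__main__":
    for accounts in (1, 100, 1_000, 10_000):  # illustrative counts only
        days = days_with_accounts(TOTAL_BYTES, BITRATE_BPS, accounts)
        print(f"{accounts:>6} accounts: {days:,.1f} days")
```

Under these assumptions a single real-time stream would need centuries, while about a thousand parallel streams would need roughly half a year. That is consistent with commenters assuming either heavy parallelism or faster-than-real-time downloads.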

Value of Metadata and Research Uses

  • The metadata dump (hundreds of millions of tracks, ISRCs, genres, keys, tempos, popularity scores) is widely seen as a goldmine for:
    • Music information retrieval, recommendation, classification, and search benchmarking.
    • Studying long‑tail listening behavior: a large majority of tracks have under 1,000 streams.
    • Genre and key distributions (e.g., unexpected prevalence of Db/C#; large counts for opera and psytrance raise questions about classification quality or auto‑generated content).
  • Some want this ingested into projects like MusicBrainz/EveryNoise or wrapped in an open API; others mention building search front‑ends and using it as IR benchmark data.
  • There’s interest in using the metadata to detect AI‑generated “slop” and mislabeled content.
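
The long-tail and key-distribution questions above amount to simple aggregations. The following is a minimal sketch that assumes each track is a dict with hypothetical `streams` and `key` fields. The real dump's schema is not described in the thread, and the sample records are invented purely to show the calculation.

```python
from collections import Counter


def long_tail_share(tracks, threshold=1_000):
    """Fraction of tracks whose stream count is below threshold."""
    if not tracks:
        return 0.0
    below = sum(1 for t in tracks if t["streams"] < threshold)
    return below / len(tracks)


def key_distribution(tracks):
    """Relative frequency of each musical key, most common first."""
    counts = Counter(t["key"] for t in tracks)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.most_common()}


# Invented sample records; field names are assumptions, not the dump's schema.
sample = [
    {"streams": 12, "key": "Db"},
    {"streams": 950, "key": "C"},
    {"streams": 40_000, "key": "Db"},
    {"streams": 3, "key": "A"},
]

if __name__ == "__main__":
    print(long_tail_share(sample))
    print(key_distribution(sample))
```

On a real dataset the same two functions would show how large the sub-1,000-stream share is and whether a key such as Db/C# is over-represented, which could then prompt a closer look at classification quality.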

AI Training and “Slop” Concerns

  • Many expect big tech and AI labs to be early heavy users; Anna’s Archive already advertises paid bulk access for AI training.
  • Critics see this as fueling even more low‑effort generative music and undermining already precarious human creators.
  • Supporters reply that AI companies already scrape or license massive catalogs; this archive marginally changes access but greatly helps independent researchers.

User Behavior, Alternatives, and Blocking

  • Several note average listeners are unlikely to handle 300 TB torrents; piracy’s real draw is cheap, polished interfaces, not raw files.
  • Others describe existing consumer‑friendly piracy boxes for video as precedent, and foresee similar tools for this music set.
  • Many advocate supporting artists directly (Bandcamp, shows, merch) while self‑hosting personal libraries (Jellyfin/Navidrome/Lidarr) and using tools to back up Spotify playlists.
  • Access to Anna’s Archive is already DNS‑blocked in parts of Germany, Belgium, the Netherlands and elsewhere; users bypass via VPNs or custom DNS, and criticize private “copyright clearing” bodies driving such blocks.