A large collection of data reportedly taken from Spotify has surfaced online, drawing attention to serious issues around copyright protection, digital security, and large-scale data misuse. The dataset, estimated at close to 300 terabytes, is already being distributed through public torrent networks.
The claim comes from Anna’s Archive, a group previously known for archiving books and academic research. According to information shared by the group, it collected metadata for roughly 256 million tracks and audio files for about 86 million songs from Spotify. Anna’s Archive alleges that the archived audio accounts for nearly all listening activity on the platform, estimating coverage at around 99.6 percent of listens.
Anna’s Archive has framed the project as a cultural preservation effort. The group argues that while mainstream music is often stored in multiple locations, lesser-known songs are vulnerable to disappearing if streaming platforms remove content, lose licensing agreements, or shut down services. From this perspective, Spotify was described as a practical starting point for documenting modern music history.
The archive is reportedly organised by popularity and shared through bulk torrent files. Anna’s Archive claims that the total size of the collection makes it one of the largest publicly accessible music metadata databases ever assembled.
Details released by the group suggest that highly streamed tracks were stored in their original 160 kbps format, while less popular songs were compressed into smaller files to reduce storage demands. Music released after July 2025 may not be included. At present, only the metadata is fully available; audio files are being released gradually, beginning with the most popular tracks.
Spotify has since issued an updated statement addressing the situation. The company confirmed it identified and disabled the user accounts involved in what it described as unlawful scraping activity. Spotify said it has introduced additional safeguards to prevent similar incidents and is actively monitoring for suspicious behaviour.
The company reiterated its long-standing position against piracy, stating that it works closely with industry partners to protect artists and copyright holders. In an earlier clarification, Spotify explained that the incident did not involve a direct breach of its internal systems. Instead, it said a third party collected public metadata and used illicit methods to bypass digital rights protections in order to access some audio files.
Spotify has not confirmed the scale of the data collection claimed by Anna’s Archive. While the group asserts that almost the entire platform was archived, Spotify has only acknowledged that a portion of its audio content may have been affected.
At this stage, it remains unclear how much of Spotify’s library was actually accessed or whether legal action will be taken to remove the data from torrent networks. Copyright experts note that redistributing licensed music without permission violates copyright laws in many jurisdictions, regardless of whether it is presented as preservation.
Whether the archive can be effectively taken down or contained remains uncertain.