Spotify Data Scraping Incident Raises Questions on Copyright, Security, and Digital Preservation

 



A large collection of data reportedly taken from Spotify has surfaced online, drawing attention to serious issues around copyright protection, digital security, and large-scale data misuse. The dataset, which is estimated to be close to 300 terabytes in size, is already being distributed through public torrent networks.

The claim comes from Anna’s Archive, a group previously known for archiving books and academic research. According to information shared by the group, it collected metadata for roughly 256 million tracks and audio files for about 86 million songs from Spotify. Anna’s Archive alleges that the archived tracks account for nearly all listening activity on the platform, putting coverage at around 99.6 percent of streams.

Anna’s Archive has framed the project as a cultural preservation effort. The group argues that while mainstream music is often stored in multiple locations, lesser-known songs are vulnerable to disappearing if streaming platforms remove content, lose licensing agreements, or shut down services. From this perspective, Spotify was described as a practical starting point for documenting modern music history.

The archive is reportedly organised by popularity and shared through bulk torrent files. Anna’s Archive claims that the total size of the collection makes it one of the largest publicly accessible music metadata databases ever assembled.

Details released by the group suggest that highly streamed tracks were stored in their original 160 kbps format, while less popular songs were compressed into smaller files to reduce storage demands. Music released after July 2025 may not be included. At present, full access is limited to metadata, with audio files being released gradually, beginning with the most popular tracks.

Spotify has since issued an updated statement addressing the situation. The company confirmed it identified and disabled the user accounts involved in what it described as unlawful scraping activity. Spotify said it has introduced additional safeguards to prevent similar incidents and is actively monitoring for suspicious behaviour.

The company reiterated its long-standing position against piracy, stating that it works closely with industry partners to protect artists and copyright holders. In an earlier clarification, Spotify explained that the incident did not involve a direct breach of its internal systems. Instead, it said a third party collected public metadata and used illicit methods to bypass digital rights protections in order to access some audio files.

Spotify has not confirmed the scale of the data collection claimed by Anna’s Archive. While the group asserts that almost the entire platform was archived, Spotify has only acknowledged that a portion of its audio content may have been affected.

At this stage, it remains unclear how much of Spotify’s library was actually accessed or whether legal action will be taken to remove the data from torrent networks. Copyright experts note that redistributing licensed music without permission violates copyright laws in many jurisdictions, regardless of whether it is presented as preservation.

Whether the archive can be effectively taken down or contained remains uncertain.

U.S. Startup Launches Mobile Service That Requires No Personal Identification

 



A newly launched U.S. mobile carrier is challenging long-standing telecom practices by offering phone service without requiring customers to submit personal identification. The company, Phreeli, presents itself as a privacy-focused alternative in an industry known for extensive data collection.

Phreeli officially launched in early December and describes its service as being built with privacy at its core. Unlike traditional telecom providers that ask for names, residential addresses, birth dates, and other sensitive information, Phreeli limits its requirements to a ZIP code, a chosen username, and a payment method. According to the company, no customer profiles are created or sold, and user data is not shared for advertising or marketing purposes.

Customers can pay using standard payment cards, or opt for cryptocurrency if they wish to reduce traceable financial links. The service operates entirely on a prepaid basis, with no contracts involved. Monthly plans range from lower-cost options for light usage to higher-priced tiers for customers who require more mobile data. The absence of contracts aligns with the company’s approach, as formal agreements typically require verified personal identities.

Rather than building its own cellular infrastructure, Phreeli operates as a Mobile Virtual Network Operator. This means it provides service by leasing network access from an established carrier, in this case T-Mobile. This model allows Phreeli to offer nationwide coverage without owning physical towers or equipment.

Addressing legal concerns, the company states that U.S. law does not require mobile carriers to collect customer names in order to provide service. To manage billing while preserving anonymity, Phreeli says it uses a system that separates payment information from communication data. This setup relies on cryptographic verification to confirm that accounts are active, without linking call records or data usage to identifiable individuals.
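Phreeli has not published the technical details of this separation, but a minimal sketch can illustrate the general idea under stated assumptions: a billing component mints an opaque, expiring token once a payment clears, and the network side only checks that a presented token is genuine and unexpired. The token format, key handling, and function names below are hypothetical, not Phreeli’s actual design.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

# Hypothetical sketch only -- Phreeli's real system is not public. The idea:
# the billing side proves "some account is paid up" with an opaque signed token,
# so the network side never needs a name, and usage records key off the token.

BILLING_KEY = secrets.token_bytes(32)  # held by the billing system; a real design
                                       # would give the verifier only a public
                                       # verification key (e.g., Ed25519)

def issue_service_token(days_valid: int = 30) -> str:
    """Billing side: mint an anonymous, expiring service token after payment clears."""
    body = json.dumps(
        {"tok": secrets.token_hex(16),                     # random, not derived from identity
         "exp": int(time.time()) + days_valid * 86400},
        sort_keys=True,
    ).encode()
    tag = hmac.new(BILLING_KEY, body, hashlib.sha256).hexdigest().encode()
    return base64.urlsafe_b64encode(body + b"." + tag).decode()

def verify_service_token(token: str) -> bool:
    """Network side: accept traffic only for genuine, unexpired tokens.

    Nothing here identifies the subscriber; joining call records back to a
    payment card would require the separate billing database.
    """
    try:
        body, tag = base64.urlsafe_b64decode(token.encode()).rsplit(b".", 1)
    except Exception:
        return False
    expected = hmac.new(BILLING_KEY, body, hashlib.sha256).hexdigest().encode()
    return hmac.compare_digest(expected, tag) and json.loads(body)["exp"] > time.time()

token = issue_service_token()
print(verify_service_token(token))  # True while the subscription window is open
```

In a scheme along these lines, the network operator can throttle or suspend a token for abuse without ever learning who bought it, which is consistent with how the company describes its anti-abuse stance below.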

The company’s privacy policy notes that information will only be shared when necessary to operate the service or when legally compelled. By limiting the amount of data collected from the start, Phreeli argues that there is little information available even in the event of legal requests.

Phreeli was founded by Nicholas Merrill, who previously operated an internet service provider and became involved in a prolonged legal dispute after challenging a government demand for user information. That experience reportedly influenced the company’s data-minimization philosophy.

While services that prioritize anonymity are often associated with misuse, Phreeli states that it actively monitors for abusive behavior. Accounts involved in robocalling or scams may face restrictions or suspension.

As concerns about digital surveillance and commercial data harvesting continue to grow, Phreeli’s launch sets the stage for a broader discussion about privacy in everyday communication. Whether this model gains mainstream adoption remains uncertain, but it introduces a notable shift in how mobile services can be structured in the United States.



Connected Car Privacy Risks: How Modern Vehicles Secretly Track and Sell Driver Data

 

The thrill of a smooth drive—the roar of the engine, the grip of the tires, and the comfort of a high-end cabin—often hides a quieter, more unsettling reality. Modern cars are no longer just machines; they’re data-collecting devices on wheels. While you enjoy the luxury and performance, your vehicle’s sensors silently record your weight, listen through cabin microphones, track your every route, and log detailed driving behavior. This constant surveillance has turned cars into one of the most privacy-invasive consumer products ever made. 

The Mozilla Foundation recently reviewed 25 major car brands and declared that modern vehicles are “the worst product category we have ever reviewed for privacy.” Not a single automaker met even basic standards for protecting user data. The organization found that cars collect massive amounts of information—from location and driving patterns to biometric data—often without explicit user consent or transparency about where that data ends up. 

The Federal Trade Commission (FTC) has already taken notice. The agency recently pursued General Motors (GM) and its subsidiary OnStar for collecting and selling drivers’ precise location and behavioral data without obtaining clear consent. Investigations revealed that data from vehicles could be gathered as frequently as every three seconds, offering an extraordinarily detailed picture of a driver’s habits, destinations, and lifestyle. 

That information doesn’t stay within the automaker’s servers. Instead, it’s often shared or sold to data brokers, insurers, and marketing agencies. Driver behavior, acceleration patterns, late-night trips, or frequent stops at specific locations could be used to adjust insurance premiums, evaluate credit risk, or profile consumers in ways few drivers fully understand. 

Inside the car, the illusion of comfort and control masks a network of tracking systems. Voice assistants that adjust your seat or temperature remember your commands. Smartphone apps that unlock the vehicle transmit telemetry data back to corporate servers. Even infotainment systems and microphones quietly collect information that could identify you and your routines. The same technology that powers convenience features also enables invasive data collection at an unprecedented scale. 

For consumers, awareness is the first defense. Before buying a new vehicle, it’s worth asking the dealer what kind of data the car collects and how it’s used. If they cannot answer directly, it’s a strong indication of a lack of transparency. After purchase, disabling unnecessary connectivity or data-sharing features can help protect privacy. Declining participation in “driver score” programs or telematics-based insurance offerings is another step toward reclaiming control. 

As automakers continue to blend luxury with technology, the line between innovation and intrusion grows thinner. Every drive leaves behind a digital footprint that tells a story—where you live, work, shop, and even who rides with you. The true cost of modern convenience isn’t just monetary—it’s the surrender of privacy. The quiet hum of the engine as you pull into your driveway should represent freedom, not another connection to a data-hungry network.

Disney to Pay $10 Million Fine in FTC Settlement Over Child Data Collection on YouTube

 

Disney has agreed to pay a $10 million penalty to resolve allegations brought by the Federal Trade Commission (FTC) that it unlawfully collected personal data from young viewers on YouTube without securing parental consent. Federal law under the Children’s Online Privacy Protection Act (COPPA) requires parental approval before companies can gather data from children under the age of 13. 

The case, filed by the U.S. Department of Justice on behalf of the FTC, accused Disney Worldwide Services Inc. and Disney Entertainment Operations LLC of failing to comply with COPPA by not properly labeling Disney videos on YouTube as “Made for Kids.” This mislabeling allegedly allowed the company to collect children’s data for targeted advertising purposes. 

“This case highlights the FTC’s commitment to upholding COPPA, which ensures that parents, not corporations, control how their children’s personal information is used online,” said FTC Chair Andrew N. Ferguson in a statement. 

As part of the settlement, Disney will pay a $10 million civil penalty and implement stricter mechanisms to notify parents and obtain consent before collecting data from underage users. The company will also be required to establish a panel to review how its YouTube content is designated. According to the FTC, these measures are intended to reshape how Disney manages child-directed content on the platform and to encourage the adoption of age verification technologies. 

The complaint explained that Disney opted to designate its content at the channel level rather than individually marking each video as “Made for Kids” or “Not Made for Kids.” This approach allegedly enabled the collection of data from child-directed videos, which YouTube then used for targeted advertising. Disney reportedly received a share of the ad revenue and, in the process, exposed children to age-inappropriate features such as autoplay.  
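YouTube’s actual enforcement logic is not public, so the following is only a schematic sketch, using hypothetical names, of why the labeling level matters: a per-video designation can override a channel-level default, and the effective label is what gates data collection for targeted advertising.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch -- not YouTube's real implementation. It illustrates how a
# channel-level default leaves unlabeled child-directed videos open to data
# collection unless each video carries its own "made for kids" designation.

@dataclass
class Channel:
    name: str
    made_for_kids_default: bool          # channel-level designation

@dataclass
class Video:
    title: str
    channel: Channel
    made_for_kids: Optional[bool] = None # per-video override, if the uploader sets one

def effective_made_for_kids(video: Video) -> bool:
    """Per-video label wins; otherwise fall back to the channel-level default."""
    if video.made_for_kids is not None:
        return video.made_for_kids
    return video.channel.made_for_kids_default

def may_collect_for_targeted_ads(video: Video) -> bool:
    """Under COPPA, child-directed content cannot feed targeted-ad data collection."""
    return not effective_made_for_kids(video)

# A channel designated "not made for kids" at the channel level leaves every
# unlabeled child-directed video open to collection -- the behavior at issue:
channel = Channel("ExampleStudio", made_for_kids_default=False)
unlabeled_cartoon = Video("Cartoon episode", channel)
print(may_collect_for_targeted_ads(unlabeled_cartoon))                          # True
labeled_cartoon = Video("Cartoon episode", channel, made_for_kids=True)
print(may_collect_for_targeted_ads(labeled_cartoon))                            # False
```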

The FTC noted that YouTube first introduced mandatory labeling requirements for creators, including Disney, in 2019 following an earlier settlement over COPPA violations. Despite these requirements, Disney allegedly continued mislabeling its content, undermining parental safeguards. 

“The order penalizes Disney’s abuse of parental trust and sets a framework for protecting children online through mandated video review and age assurance technology,” Ferguson added. 

The settlement arrives alongside an unrelated investigation launched earlier this year by the Federal Communications Commission (FCC) into alleged hiring practices at Disney and its subsidiary ABC. While separate, the two cases add to the regulatory pressure the entertainment giant is facing. 

The Disney case underscores growing scrutiny of how major media and technology companies handle children’s privacy online, particularly as regulators push for stronger safeguards in digital environments where young audiences are most active.

New Forensic System SIDE Tracks Ghost Guns Made With 3D Printing

 

The rapid rise of 3D printing has transformed manufacturing, offering efficient ways to produce tools, spare parts, and even art. But the same technology has also enabled the creation of “ghost guns” — firearms built outside regulated systems and nearly impossible to trace. These weapons have already been linked to crimes, including the 2024 murder of UnitedHealthcare CEO Brian Thompson, sparking concern among policymakers and law enforcement. 

Now, new research suggests that even if such weapons are broken into pieces, investigators may still be able to extract critical identifying details. Researchers from Washington University in St. Louis, led by Netanel Raviv, have developed a system called Secure Information Embedding and Extraction (SIDE). Unlike earlier fingerprinting methods that embedded printer IDs, timestamps, or location data directly in printed objects, SIDE is designed to withstand tampering. 

Even if an object is deliberately smashed, the embedded information remains recoverable, giving investigators a powerful forensic tool. The SIDE framework is built on earlier research presented at the 2024 IEEE International Symposium on Information Theory, which introduced techniques for encoding data that could survive partial destruction. This new version adds enhanced security mechanisms, creating a more resilient system that could be integrated into 3D printers. 

The approach does not rely on obvious markings but instead uses loss-tolerant mathematical embedding to hide identifying information within the material itself. As a result, even small fragments of plastic or resin may contain enough data to help reconstruct an object’s origin. Such technology could help reduce the spread of ghost guns and make it more difficult for criminals to use 3D printing for illicit purposes. 
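The published SIDE construction rests on coding-theoretic machinery well beyond a blog sketch, but a toy repetition-code example can convey the loss-tolerance idea: the same identifier, protected by a short checksum, is embedded in many independent regions of the print, so any single surviving region is enough to recover it. All names and parameters here are illustrative assumptions, not the actual SIDE scheme.

```python
import hashlib
import secrets

# Toy illustration only -- the real SIDE framework uses far more sophisticated
# loss-tolerant codes. The idea shown here: stamp the same identifier, protected
# by a short checksum, into many independent regions of the printed object, so
# that any single region surviving the smash is enough to recover it.

def make_marked_regions(printer_id: bytes, num_regions: int = 64):
    """Payloads the printer firmware would hide in each physical region of the print."""
    checksum = hashlib.sha256(printer_id).digest()[:4]
    return [printer_id + checksum for _ in range(num_regions)]

def recover_id(surviving_fragments, id_len: int):
    """Return the first recovered payload whose checksum verifies, else None."""
    for frag in surviving_fragments:
        candidate, checksum = frag[:id_len], frag[id_len:id_len + 4]
        if hashlib.sha256(candidate).digest()[:4] == checksum:
            return candidate
    return None

printer_id = secrets.token_bytes(8)          # e.g., a serial number known to the manufacturer
regions = make_marked_regions(printer_id)
smashed = regions[::13]                      # pretend most regions were destroyed
print(recover_id(smashed, len(printer_id)) == printer_id)   # True
```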

However, the system also raises questions about regulation and personal freedom. If fingerprinting becomes mandatory, even hobbyist printers used for harmless projects may be subject to oversight. This balance between improving security and protecting privacy is likely to spark debate as governments consider regulation. The potential uses of SIDE go far beyond weapons tracing. Any object created with a 3D printer could carry an invisible signature, allowing investigators to track timelines, production sources, and usage. 

Combined with artificial intelligence tools for pattern recognition, this could give law enforcement powerful new forensic capabilities. “This work opens up new ways to protect the public from the harmful aspects of 3D printing through a combination of mathematical contributions and new security mechanisms,” said Raviv, assistant professor of computer science and engineering at Washington University. He noted that while SIDE cannot guarantee protection against highly skilled attackers, it significantly raises the technical barriers for criminals seeking to avoid detection.

Federal Judge Allows Amazon Alexa Users’ Privacy Lawsuit to Proceed Nationwide

 

A federal judge in Seattle has ruled that Amazon must face a nationwide lawsuit involving tens of millions of Alexa users. The case alleges that the company improperly recorded and stored private conversations without user consent. U.S. District Judge Robert Lasnik determined that Alexa owners met the legal requirements to pursue collective legal action for damages and an injunction to halt the alleged practices. 

The lawsuit claims Amazon violated Washington state law by failing to disclose that it retained and potentially used voice recordings for commercial purposes. Plaintiffs argue that Alexa was intentionally designed to secretly capture billions of private conversations, not just the voice commands directed at the device. According to their claim, these recordings may have been stored and repurposed without permission, raising serious privacy concerns. Amazon strongly disputes the allegations. 

The company insists that Alexa includes multiple safeguards to prevent accidental activation and says there is no evidence it recorded conversations belonging to any of the plaintiffs. Despite Amazon’s defense, Judge Lasnik stated that millions of users may have been impacted in a similar manner, allowing the case to move forward. Plaintiffs are also seeking an order requiring Amazon to delete any recordings and related data it may still hold. The broader issue at stake in this case centers on privacy rights within the home.

If proven, the claims suggest that sensitive conversations could have been intercepted and stored without explicit approval from users. Privacy experts caution that voice data, if mishandled or exposed, can lead to identity risks, unauthorized information sharing, and long-term security threats. Critics further argue that the lawsuit highlights the growing power imbalance between consumers and large technology companies. Amazon has previously faced scrutiny over its corporate practices, including its environmental footprint. 

A 2023 report revealed that the company’s expanding data centers in Virginia would consume more energy than the entire city of Seattle, fueling additional criticism about the company’s long-term sustainability and accountability. The case against Amazon underscores the increasing tension between technological convenience and personal privacy. 

As voice-activated assistants become commonplace in homes, courts will likely play a decisive role in determining the boundaries of data collection and consumer protection. The outcome of this lawsuit could set a precedent for how tech companies handle user data and whether customers can trust that private conversations remain private.

Research Raises Concerns Over How Apple’s Siri and AI System Handle User Data

 



Apple’s artificial intelligence platform, Apple Intelligence, is under the spotlight after new cybersecurity research suggested it may collect and send more user data to company servers than its privacy assurances suggest.

The findings were presented this week at the 2025 Black Hat USA conference by Israeli cybersecurity firm Lumia Security. The research examined how Apple’s long-standing voice assistant Siri, now integrated into Apple Intelligence, processes commands, messages, and app interactions.


Sensitive Information Sent Without Clear Need

According to lead researcher Yoav Magid, Siri sometimes transmits data that seems unrelated to the user’s request. For example, when someone asks Siri a basic question such as the day’s weather, the system not only fetches weather information but also scans the device for all weather-related applications and sends that list to Apple’s servers.

The study found that Siri includes location information with every request, even when location is not required for the answer. In addition, metadata about audio content, such as the name of a song, podcast, or video currently playing, can also be sent to Apple without the user having clear visibility into these transfers.


Potential Impact on Encrypted Messaging

One of the most notable concerns came from testing Siri’s dictation feature for apps like WhatsApp. WhatsApp is widely known for offering end-to-end encryption, which is designed to ensure that only the sender and recipient can read a message. However, Magid’s research indicated that when messages are dictated through Siri, the text may be transmitted to Apple’s systems before being delivered to the intended recipient.

This process takes place outside of Apple’s heavily marketed Private Cloud Compute system, the part of Apple Intelligence meant to add stronger privacy protections. It raises questions about whether encrypted services remain fully private when accessed via Siri.


Settings and Restrictions May Not Prevent Transfers

Tests revealed that these data transmissions sometimes occur even when users disable Siri’s learning features for certain apps, or when they attempt to block Siri’s connection to Apple servers. This suggests that some data handling happens automatically, regardless of user preferences.


Different Requests, Different Privacy Paths

Magid also discovered inconsistencies in how similar questions are processed. For example, asking “What’s the weather today?” may send information through Siri’s older infrastructure, while “Ask ChatGPT what’s the weather today?” routes the request through Apple Intelligence’s Private Cloud Compute. Each route follows different privacy rules, leaving users uncertain about how their data is handled.

Apple acknowledged that it reviewed the findings earlier this year. The company later explained that the behavior stems from SiriKit, a framework that allows Siri to work with third-party apps, rather than from Apple Intelligence itself. Apple maintains that its privacy policies already cover these practices and disagrees with the view that they amount to a privacy problem.

Privacy experts say this situation illustrates the growing difficulty of understanding data handling in AI-driven services. As Magid pointed out, with AI integrated into so many modern tools, it is no longer easy for users to tell when AI is at work or exactly what is happening to their information.




OpenAI Launching AI-Powered Web Browser to Rival Chrome, Drive ChatGPT Integration

 

OpenAI is reportedly developing its own web browser, integrating artificial intelligence to offer users a new way to explore the internet. According to sources cited by Reuters, the tool is expected to be unveiled in the coming weeks, although an official release date has not yet been announced. With this move, OpenAI seems to be stepping into the competitive browser space with the goal of challenging Google Chrome’s dominance, while also gaining access to valuable user data that could enhance its AI models and advertising potential. 

The browser is expected to serve as more than just a window to the web—it will likely come packed with AI features, offering users the ability to interact with tools like ChatGPT directly within their browsing sessions. This integration could mean that AI-generated responses, intelligent page summaries, and voice-based search capabilities are no longer separate from web activity but built into the browsing experience itself. Users may be able to complete tasks, ask questions, and retrieve information all within a single, unified interface. 

A major incentive for OpenAI is the access to first-party data. Currently, most of the data that fuels targeted advertising and search engine algorithms is captured by Google through Chrome. By creating its own browser, OpenAI could tap into a similar stream of data—helping to both improve its large language models and create new revenue opportunities through ad placements or subscription services. While details on privacy controls are unclear, such deep integration with AI may raise concerns about data protection and user consent. 

Despite the potential, OpenAI faces stiff competition. Chrome currently holds a dominant share of the global browser market, with nearly 70% of users relying on it for daily web access. OpenAI would need to provide compelling reasons for people to switch—whether through better performance, advanced AI tools, or stronger privacy options. Meanwhile, other companies are racing to enter the same space. Perplexity AI, for instance, recently launched a browser named Comet, giving early adopters a glimpse into what AI-first browsing might look like. 

Ultimately, OpenAI’s browser could mark a turning point in how artificial intelligence intersects with the internet. If it succeeds, users might soon navigate the web in ways that are faster, more intuitive, and increasingly guided by AI. But for now, whether this approach will truly transform online experiences—or simply add another player to the browser wars—remains to be seen.