
AI Models Produce Photos of Real People and Copyrighted Images


According to new research, popular image generation models can be prompted to produce identifiable photos of real people, potentially infringing on the privacy of numerous individuals.

The study also demonstrates that these AI systems can be made to reproduce exact copies of copyrighted artwork and medical images, a result that might help artists who are suing AI companies for copyright violations.

Research: Extracting Training Data from Diffusion Models 

Researchers from Google, DeepMind, UC Berkeley, ETH Zürich, and Princeton obtained their findings by repeatedly prompting Google’s Imagen with captions drawn from its training data, such as a person’s name, and then checking whether any of the generated images matched the original photos in the model’s training set. The team managed to extract more than 100 near-identical copies of training photos this way.
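The core idea of the attack is conceptually simple: sample the model many times with the same training caption, then flag near-duplicate generations, since a model that keeps producing the same image for a prompt has likely memorized it. The Python sketch below illustrates that idea only; the `generate` callable and the distance threshold are hypothetical placeholders, not the paper’s actual implementation.

```python
import itertools
import numpy as np

def l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between two images, flattened to vectors."""
    return float(np.linalg.norm(a.astype(np.float32) - b.astype(np.float32)))

def extract_candidates(generate, prompt: str, n_samples: int = 500,
                       threshold: float = 50.0) -> list:
    """Sample one prompt many times and return generations that recur
    almost pixel-for-pixel, a signal that the model memorized them.

    `generate` stands in for any text-to-image sampler returning a
    NumPy image array; `threshold` is an assumed cutoff, not the
    calibrated value used in the paper.
    """
    samples = [generate(prompt) for _ in range(n_samples)]
    candidates = []
    for a, b in itertools.combinations(samples, 2):
        if l2_distance(a, b) < threshold:
            candidates.append(a)  # near-identical pair: likely memorized
    return candidates
```

In the actual study, flagged generations were then compared against the training images themselves to confirm that the model had reproduced specific photos.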

These image-generating models are trained on vast data sets of captioned images scraped from the internet. Diffusion models, the current state of the art, work by gradually adding noise to each training image until nothing remains but random pixels, then learning to reverse that procedure so a brand-new image can be reconstructed from pure noise.
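To make the noising step concrete, here is a minimal sketch of the forward diffusion process used in DDPM-style models: the image is mixed with Gaussian noise according to a schedule, and by the final step almost nothing of the original survives. The schedule values below are illustrative assumptions, not those of Imagen or any particular model.

```python
import numpy as np

def forward_diffuse(x0: np.ndarray, t: int, num_steps: int = 1000) -> np.ndarray:
    """Noise an image x0 forward to step t of the diffusion process.

    Uses the closed form x_t = sqrt(alpha_bar_t) * x0
    + sqrt(1 - alpha_bar_t) * eps with a simple linear beta schedule
    (illustrative values, not tuned to any real model).
    """
    betas = np.linspace(1e-4, 0.02, num_steps)   # per-step noise amounts
    alpha_bar = np.cumprod(1.0 - betas)          # cumulative fraction of signal kept
    eps = np.random.randn(*x0.shape)             # Gaussian noise
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

# At t near num_steps - 1 the output is essentially pure noise;
# a trained model learns to run this process in reverse, step by step.
```

Memorization arises when the learned reverse process reconstructs a specific training image rather than a genuinely novel one.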

According to Ryan Webster, a Ph.D. student at the University of Caen Normandy who has studied privacy in other image generation models but was not involved in this research, the study is the first to demonstrate that these models memorize photos from their training sets. The finding also has implications for startups hoping to use generative AI in health care, since it shows that these systems risk leaking users’ private and sensitive data.

Eric Wallace, a Ph.D. student and one of the study’s authors, says the team hopes to sound the alarm about these privacy risks before such models are widely deployed in sensitive industries like medicine.

“A lot of people are tempted to try to apply these types of generative approaches to sensitive data, and our work is definitely a cautionary tale that that’s probably a bad idea unless there’s some kind of extreme safeguards taken to prevent [privacy infringements],” Wallace says. 

The extent to which these models memorize and regurgitate images from their training data is also at the heart of a major conflict between AI companies and artists. Getty Images and a group of artists have filed separate lawsuits against Stability AI, claiming the company illicitly scraped and processed their copyrighted content.

The researchers’ findings could help those artists argue that AI companies have violated their copyright. If the artists can show that Stable Diffusion reproduced their work without consent, the companies that trained it may have to compensate them.

According to Sameer Singh, an associate professor of computer science at the University of California, Irvine, these findings are significant. “It is important for general public awareness and to initiate discussions around the security and privacy of these large models,” he adds.