March 27, 2024
Nightshade's Data Poisoning: A New Weapon Against AI Intellectual Property Theft

Generative AI models are built on vast libraries of previously created artwork, and even as the technology’s striking capacity to produce visual imagery becomes more accessible, artists are scrambling for ways to stop their creations from being appropriated without consent. The solution could lie in a new tool with the menacing moniker Nightshade.

The technique uses “data poisoning attacks” tailored to individual prompts, corrupting the image data that is fed into a generator to train AI models.

“Poisoning has been a known attack vector in machine learning models for years,” Professor Ben Zhao stated. “Nightshade is not interesting because it does poisoning, but because it poisons generative AI models, which nobody thought was possible because these models are so big.”

With generative AI models going mainstream this year, combating intellectual property theft and AI deepfakes has become imperative. A group of academics at MIT made a similar proposal in July, suggesting that tiny pieces of code be inserted into images to make them warp and become useless for training.

The term “generative AI” describes AI models that produce text, images, music, or video in response to prompts. Microsoft, Google, Amazon, and Meta have all invested heavily in bringing generative AI tools to their customers.

According to Zhao, Nightshade sidesteps the sheer scale of an AI model’s training data by targeting individual prompts, such as requests to generate an image of a dragon, dog, or horse.

“Attacking the whole model makes no sense,” Zhao stated. “What you do want to attack is individual prompts, debilitating the model and disabling it from generating art.”

The research team explained that, to achieve the desired effect, the text and images inside the poisoned data must be carefully crafted to look genuine, deceiving both automated alignment detectors and human inspectors in order to evade discovery.

Zhao said the poisoned Nightshade dataset is only a proof of concept: the simplest way to trick an AI model like Stable Diffusion into believing a cat is a dog is to mislabel a few hundred photographs of cats as dogs.
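The mislabeling attack Zhao describes can be sketched in a few lines. This is a minimal illustration of naive label poisoning, not Nightshade’s actual algorithm (which crafts subtle, hard-to-detect perturbations); the dataset structure and function name here are hypothetical:

```python
# Toy sketch of the simplest poisoning attack described above: relabeling
# images so a text-to-image model learns the wrong association.
# (Hypothetical example -- not Nightshade's real, stealthier method.)

def poison_captions(dataset, target="cat", decoy="dog"):
    """Relabel every 'target' caption as 'decoy', mimicking the
    cats-mislabeled-as-dogs attack."""
    poisoned = []
    for image_path, caption in dataset:
        if caption == target:
            caption = decoy  # image still shows a cat, but is captioned "dog"
        poisoned.append((image_path, caption))
    return poisoned

clean = [("cat_001.png", "cat"), ("dog_001.png", "dog"), ("cat_002.png", "cat")]
poisoned = poison_captions(clean)
# A model trained on enough of these mislabeled pairs begins to associate
# the "dog" caption with cat imagery.
```

A real attack of this kind would need hundreds of such pairs to shift a model’s behavior, and Nightshade’s contribution is making the poisoned samples look genuine rather than obviously mislabeled.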

Even without coordination, artists might start deploying these poison pills widely enough to cause an AI model to collapse.

“Once enough attacks become active on the same model, the model becomes worthless,” Zhao explained. “By worthless, I mean, you give it things like ‘give me a painting,’ and it comes out with what looks like a kaleidoscope of pixels. The model is effectively dumbed down to the version of something akin to a random pixel generator.”

Zhao said that Nightshade requires no action against the AI image generator itself; it only becomes active when an AI model tries to consume the poisoned data it has ingested.

He described it less as an assault than as a barbed-wire fence with poisoned points, aimed at AI developers who ignore opt-out requests and do-not-scrape directives: “It does nothing to them unless they take those images and put them into the training data.”

“This is designed to solve that problem. So, we had this barbed wire sting with some poison. Unless you run around and get this stuff all over, you won’t suffer,” Zhao concluded.

