University of Chicago researchers seek to “poison” AI art generators with Nightshade
Altered images could destroy AI model training efforts that scrape art without consent.
@arstechnica If only a technique like this existed for written text. I'd throw salt in everything I still have editorial control over.
@aran @arstechnica I'm curious if that's actually enough to do meaningful damage to an LLM's training as described in the paper, though? Also, for the record, I was not thinking about blog posts but rather code repositories, and I'm not sure you'd end up with compilable code that way. But it's an idea, for sure.
Now that I think of it, any visual discomfort could then be ironed out by using a custom font that, say, draws Greek kappa with the same glyph as k.
Yes, this will interfere with screen readers for the blind, and it is less compatible with older browsers, but conceptually it can work, especially if randomized.
Conceptually, it shouldn't be too hard to randomly mess up text with Unicode lookalikes, invisible characters and the like.
Something like Wordpress could trivially do it with a plug-in.
Ⅼiκe thís
(Roman numeral Ⅼ, Greek kappa, non-breaking space, zero-width space, i with acute accent)
https://gist.github.com/StevenACoffman/a5f6f682d94e38ed804182dc2693ed4b
https://github.com/codebox/homoglyph/blob/master/raw_data/chars.txt
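For anyone who wants to try it, here is a rough Python sketch of the randomized substitution described above. The lookalike table is a tiny hand-picked one for illustration only (a real one would be built from a list like the two linked above), and the names `salt_text`, `unsalt_text` and the `rate` parameter are made up for this example.

```python
import random

# Hypothetical, hand-picked lookalike table (illustration only).
HOMOGLYPHS = {
    "L": "\u216c",  # ROMAN NUMERAL FIFTY
    "k": "\u03ba",  # GREEK SMALL LETTER KAPPA
    "i": "\u00ed",  # LATIN SMALL LETTER I WITH ACUTE
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    " ": "\u00a0",  # NO-BREAK SPACE
}
ZERO_WIDTH_SPACE = "\u200b"


def salt_text(text, rate=0.3, seed=None):
    """Randomly swap characters for lookalikes and sprinkle in zero-width spaces."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        # Replace a substitutable character with probability `rate`.
        if ch in HOMOGLYPHS and rng.random() < rate:
            out.append(HOMOGLYPHS[ch])
        else:
            out.append(ch)
        # Less often, add an invisible character after it.
        if rng.random() < rate / 3:
            out.append(ZERO_WIDTH_SPACE)
    return "".join(out)


def unsalt_text(text):
    """Undo salt_text for input that was plain ASCII to begin with."""
    reverse = {v: k for k, v in HOMOGLYPHS.items()}
    return "".join(reverse.get(ch, ch) for ch in text if ch != ZERO_WIDTH_SPACE)


if __name__ == "__main__":
    salted = salt_text("Like this", rate=1.0, seed=1)
    print(salted)
    print(unsalt_text(salted))
```

Note that `unsalt_text` is only a few lines long: a scraper that knows the table can reverse the whole thing, which is part of why it is unclear how much damage this would really do.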
@aran @arstechnica agree it's worth investigation for sure
Code would definitely be messed up. 😄
As for the damage, text files are way more minimalist: there's much less data to play with, and everything is read directly by users instead of being a luminance value we barely notice.
Still, random invisible noise multiplied by all the streamlining and reworking happening in LLMs could lead to fun results. It would definitely be fun to investigate.
@wraptile @arstechnica Could you explain what you mean a little better?
@roadriverrail @arstechnica the way people got captured by copyright propaganda so quickly is truly scary.
@wraptile @arstechnica Ah, now I understand what you were saying. Thanks.
@roadriverrail @arstechnica I'm not even putting on a tinfoil hat, but 99% of all copyright is owned by mega corporations that are literally in lawsuits with generative AI (like gettyimages), and suddenly there's a trend of "protecting the little guys" from copyright theft?
Generative AI is by far the biggest disruption to the copyright industry. Like nothing ever came as close to breaking it as AI is rn. The whole industry is shitting its pants and the astroturf is in full force. You are falling for it tbh