Confirmed it myself: ChatGPT is crawling the fediverse, even servers like aoir.social whose policies prohibit crawling and scraping of data. cc @admin1 @nik @ubiquity75 @rwg @tstruett @paufder ht @atomicpoet for bringing this to my attention
-
Aram Sinnreich (aram@aoir.social)'s status on Sunday, 12-Jan-2025 06:40:33 JST Aram Sinnreich
-
Kevin Freitas (kevinfreitas@mastodon.social)'s status on Sunday, 12-Jan-2025 09:31:40 JST Kevin Freitas @mr_southgate It looks at what’s called the user agent string that every browser or bot carries with it when browsing the web. I look for terms that are unique to known AI bots and scramble things for them. Nothing guarantees an AI company will be honest with how they label their traffic but I figure it can’t hurt (us). Cheers!
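A minimal sketch of the approach described above, written in Python rather than as a WordPress plugin. It is not the plugin's actual code: the token list, the function names (`is_ai_bot`, `garble`, `render`), and the word-shuffling method are all assumptions chosen for illustration. The idea is only that the server inspects the User-Agent header and garbles text for matching clients, while browsers and screen readers, whose user agents do not match, get the real text.

```python
import random

# Illustrative substrings only (assumption); not the plugin's actual list.
AI_BOT_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "CCBot")


def is_ai_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a listed bot token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_BOT_TOKENS)


def garble(text: str, seed: int = 0) -> str:
    """Shuffle the characters inside each whitespace-separated word."""
    rng = random.Random(seed)
    words = []
    for word in text.split():
        chars = list(word)
        rng.shuffle(chars)
        words.append("".join(chars))
    return " ".join(words)


def render(text: str, user_agent: str) -> str:
    """Serve real text to ordinary clients, garbled text to flagged bots."""
    return garble(text) if is_ai_bot(user_agent) else text
```

As the post notes, this only works against bots that label their traffic honestly; a crawler sending a browser-like user agent would receive the real text.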
-
Kevin Freitas (kevinfreitas@mastodon.social)'s status on Sunday, 12-Jan-2025 09:31:43 JST Kevin Freitas @nik @rwg @Andres4NY @tstruett @admin1 @paufder @aram @ubiquity75 If anyone has a #WordPress site I have a prototype plugin that garbles all the text on a site for #AI bots:
https://kevinfreitas.net/tools-experiments/
We need to flood their theft with garbage.
-
Michael Southgate (mr_southgate@mastodon.social)'s status on Sunday, 12-Jan-2025 09:31:43 JST Michael Southgate @KevinFreitas How does this work? Not a tech guy so just curious as to how anyone can fool scrapers while giving humans and screen readers the real deal.
-
Andres Salomon (andres4ny@social.ridetrans.it)'s status on Sunday, 12-Jan-2025 09:31:44 JST Andres Salomon @rwg @aram @admin1 @nik @ubiquity75 @tstruett @paufder We need to poison their data collection. Fill servers with bots full of autogenerated text, for example.
-
Robert W. Gehl (rwg@aoir.social)'s status on Sunday, 12-Jan-2025 09:31:45 JST Robert W. Gehl @aram @admin1 @nik @ubiquity75 @tstruett @paufder
Sigh. Cue up those who would make the disingenuous "if you didn't want this, don't post to the Internet!" argument.
We need state-based regulation of this theft of our sociality ASAP. Whether it's true opt-in, or more robust IP protections for regular people.
-
Paul Sutton (zleap@qoto.org)'s status on Sunday, 12-Jan-2025 16:38:57 JST Paul Sutton @aram @admin1 @nik @ubiquity75 @rwg @tstruett @paufder @atomicpoet
I tried to ask a similar question about @QOTO, and it could not find anything. This issue probably needs further investigation to see the extent of scraping.
-
þernia (pernia@cum.salon)'s status on Monday, 13-Jan-2025 10:08:58 JST þernia @aram @admin1 @nik @ubiquity75 @rwg @tstruett @paufder amazing. too bad they probably FILTER the salon, fucking cucks