Someone was scraping the shit out of FediList (hard enough that the bandwidth was getting eaten) so I popped in to look at the logs. Nothing interesting: the problem was solved when I killed off Huawei Cloud's IP space. (You are not running Firefox 3.6 on Ubuntu 10 or 4.0b11pre on Windows Server, you lazy motherfucker. Update your fake-ass UA strings.) But while I was in there I looked around a little more and apparently OpenAI was scraping it. I thought I'd told them, via robots.txt, to fuck off, so I checked the URL.
Usually, if I see a bot and I can't view, over Tor, the URL its operator puts in the UA, I will just kill the bot. OpenAI won't show you the URL without JavaScript (the "blank white screen" fail), they block mothra, *and* they have apparently blocked my actual IP, because they are giving me 403s.
Letting them redirect you from https://openai.com/searchbot to https://platform.openai.com/docs/bots/ and then run *literally* 6MB of JavaScript, though, will allow you to view the four paragraphs of text (plus a few links and the UA strings) at https://archive.is/cCuWn . This is next-level horseshit; they should ask their bot to write them a thing that puts text on a website GODDAMN.
fuckinghell.png
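For the "did I actually tell them to go away via robots.txt" check, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (the real FediList file isn't shown in the post), `is_allowed` is an illustrative helper, and `example.org` is a placeholder. The tokens `GPTBot` and `OAI-SearchBot` are assumed to be the ones OpenAI's bot page lists; check them against the archived copy before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: an assumption for illustration, not FediList's real file.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, agent_token: str, url: str) -> bool:
    """Return True if these robots.txt rules permit agent_token to fetch url.

    Pass the bare product token (e.g. "OAI-SearchBot"), not the full UA
    header: the parser only looks at the text before the first "/".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    for token in ("GPTBot", "OAI-SearchBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, token, "https://example.org/") else "disallowed"
        print(f"{token}: {verdict}")
```

This only tells you what the file asks for. robots.txt is advisory, so a bot that ignores it still has to be dropped at the server or firewall, which is what killing the IP space does.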