got rid of all of them. 7800 requests and 3700 requests per second respectively on both servers which is average traffic. good job poast we defeated the chicom enemy
so there's actually way more. 146 /24s and counting so far. somebody really wants this data. thats a fuck of a lot of $$$ for that many IPs. still going about 65k requests/s but im trimming them out
@einmad @pwm they don't hammer fast. it's tens of thousands of IPs crawling accounts via mentions/reposts, each making 1-2 requests every 30-45 seconds or so, sometimes only every minute or two. because there are so many of them and each one is slow, it flies under the radar: it doesn't show up as a traffic anomaly
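A minimal sketch of one way to surface this kind of low-and-slow crawl: count distinct client IPs per /24 instead of looking at per-IP request rates. The function names and the `min_distinct` threshold are assumptions for illustration, and it only handles IPv4; it is not what was actually run on the servers in this thread.

```python
import ipaddress
from collections import defaultdict

def group_by_slash24(client_ips):
    """Map each IPv4 /24 network to the set of distinct client IPs seen in it."""
    groups = defaultdict(set)
    for ip in client_ips:
        # strict=False lets us derive the enclosing /24 from a host address
        net = ipaddress.ip_network(f"{ip}/24", strict=False)
        groups[net].add(ip)
    return groups

def suspicious_ranges(client_ips, min_distinct=20):
    """Return the /24s where at least `min_distinct` different IPs showed up.

    `min_distinct` is a hypothetical threshold; tune it against real logs.
    """
    groups = group_by_slash24(client_ips)
    return sorted(
        (net for net, ips in groups.items() if len(ips) >= min_distinct),
        key=lambda n: int(n.network_address),
    )
```

Feeding it the client IPs from an access log over some window gives a list of ranges where many neighbours are active at once, even when no single IP looks busy.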
the challenge might be enough. ideally I'd like to redirect traffic to a separate server set up as a tarpit so it keeps the request open forever just to see what it would do
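A rough sketch of what such a tarpit could look like, assuming a plain asyncio TCP server that starts an HTTP response and then drips one header line at a time so the response never completes. The `X-Pad` header, the interval, and the `max_chunks` parameter (there only so the loop can be bounded in testing) are all made up for illustration.

```python
import asyncio

async def tarpit(reader, writer, interval=10.0, max_chunks=None):
    """Hold a connection open by never finishing the response headers."""
    sent = 0
    try:
        writer.write(b"HTTP/1.1 200 OK\r\n")
        # With max_chunks=None this loops until the client gives up.
        while max_chunks is None or sent < max_chunks:
            writer.write(b"X-Pad: %d\r\n" % sent)  # hypothetical filler header
            await writer.drain()
            sent += 1
            await asyncio.sleep(interval)
    except (ConnectionResetError, BrokenPipeError):
        pass  # client hung up; nothing more to do
    finally:
        writer.close()
    return sent

async def main(host="127.0.0.1", port=8081):
    # Host and port are placeholders; the real traffic redirect is not shown.
    server = await asyncio.start_server(tarpit, host, port)
    async with server:
        await server.serve_forever()

# To run it standalone: asyncio.run(main())
```

Each stuck connection costs the tarpit one idle coroutine, so watching how long clients stay connected would show what timeout the scraper uses.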
@graf @einmad @pwm Is there any way to split the Nitter and limit IPs to a certain number of requests per hour without an oauth? I don't understand enough about this stuff to know if that's a dumb idea or not.
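For reference, a per-IP hourly limit with no login can be sketched as a sliding-window counter keyed on the client address. The class name, the default of 600 requests, and the in-memory storage are assumptions for illustration. Note that at the rates described earlier in the thread (1-2 requests every 30-45 seconds) a single scraper IP makes only about 80-240 requests an hour, so a per-IP limit would have to sit below that to bite.

```python
import time
from collections import defaultdict, deque

class HourlyLimiter:
    """Sliding-window request limiter keyed by client IP (in-memory sketch)."""

    def __init__(self, max_requests=600, window_seconds=3600, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock          # injectable so the window can be tested
        self.hits = defaultdict(deque)

    def allow(self, ip):
        now = self.clock()
        q = self.hits[ip]
        # Forget requests that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True
```

Keying on the /24 instead of the single address would be one way to count a rented range as one client, at the cost of more collateral for real users sharing that range.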
@graf @pwm can't you just put a small PoW (proof of work) in front of whatever is being scraped? If a session that passed the PoW starts hammering too fast, it gets the challenge again.
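A small hashcash-style sketch of the PoW idea, assuming SHA-256 and a "find a nonce whose hash has N leading zero bits" puzzle. The function names and the `challenge:nonce` format are hypothetical; in practice the solver would run in the visitor's browser and only `verify` would run on the server.

```python
import hashlib
import os

def new_challenge():
    """Random per-session challenge string issued by the server."""
    return os.urandom(16).hex()

def leading_zero_bits(digest):
    """Count the zero bits at the front of a digest."""
    n = int.from_bytes(digest, "big")
    return len(digest) * 8 - n.bit_length()

def verify(challenge, nonce, difficulty_bits):
    """Cheap server-side check: one hash per submitted answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits

def solve(challenge, difficulty_bits):
    """Client-side brute force; expected work is about 2**difficulty_bits hashes."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce
```

Re-challenging a session that speeds up, as suggested, would just mean issuing a fresh `new_challenge()` (possibly with a higher `difficulty_bits`) once its request rate crosses some threshold.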
@graf For a big thing like this it will probably only sorta help, but I find especially for my email server that you stop being low hanging fruit for those not throwing around cash to rent residential proxies
@pwm yeah how I'm doing this definitely isn't sustainable. I'm evaluating ways I can keep it public but also limit access. I wish it was as easy as just setting up an oauth login for poast users, but others use it too, even people not on fedi, and I feel bad cutting them off to deal with some shitty people. blocking entire ranges is fine for now but people get caught in the crossfire (and iptables/nginx have limits on the number of IPs they can store 'in memory')
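One way to keep a range blocklist smaller, sketched under the assumption that the blocked ranges are IPv4 CIDR strings: merge adjacent and overlapping ranges into the fewest equivalent networks before loading them into the firewall or web server. `collapse_blocklist` is a hypothetical helper; `ipaddress.collapse_addresses` is the standard-library call doing the work.

```python
import ipaddress

def collapse_blocklist(cidrs):
    """Merge adjacent/overlapping CIDR ranges into the fewest equivalent networks."""
    # Assumes all entries are the same IP version; mixing v4 and v6 raises TypeError.
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]
```

For example, `203.0.112.0/24` and `203.0.113.0/24` collapse into a single `203.0.112.0/23`. It only helps when the blocked /24s are actually neighbours, which may or may not be the case for rented ranges.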
this weekend I'm going to tear everything out as far as blocks and stuff are concerned and do it a little more elegantly. it's a lot of work but hopefully in the long run it'll be manageable enough
@pwm this one actually was also using US residential proxies, plus a couple of US-based datacenter ranges.
I wonder if it's a mix of servers with a /24 each (you can rent them for like $150-200, they call them "SEO servers") and purchased proxies or what, but there's a lot of money at play here for a coordinated scrape that large. 200k requests per second at its peak is crazy. per server
@graf I am never kidding when I tell people this: just drop the entirety of APNIC at the firewall. There are 0 (zero) negative consequences.