@corbet @johnefrancis @LWN
Struggling with what are likely the same bots over here. I deployed a similar tarpit* on a large-ish site a few days ago, taking care not to trap the good bots, but can't say it's been very successful. It might have taken some load off the main site, but not nearly enough to make a difference.
One more thing I'm considering is prefixing all internal links with a '/botcheck/' path for potentially suspicious visitors, setting a cookie on that page, and stripping the prefix with JS. If a request reaches the /botcheck/ endpoint with the cookie set, redirect to the proper page; otherwise tarpit them. This way the site would still work as long as the user has *either* JS or cookies enabled. Still not perfect, but slightly less invasive than most common active defenses.
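
A rough sketch of how the server side of that might look, in Python. Nothing here is deployed code: the cookie name `botcheck_ok`, the helper names, and the regex-based link rewriting are all assumptions for illustration, and a real site would hook this into its own framework and templating.

```python
import re

PREFIX = "/botcheck"    # path prefix from the idea above
COOKIE = "botcheck_ok"  # hypothetical cookie name


def prefix_internal_links(html: str) -> str:
    """Rewrite root-relative hrefs so they go through /botcheck/.

    Skips protocol-relative links (//host/...) and links that
    already carry the prefix. Absolute URLs are left untouched
    because they do not start with href="/.
    """
    return re.sub(r'href="/(?!/|botcheck/)', f'href="{PREFIX}/', html)


def handle_botcheck(path: str, cookies: dict) -> tuple:
    """Decide what to do with a request.

    Returns ("pass", path) for normal paths, ("redirect", target)
    when the cookie is present, and ("tarpit", None) otherwise.
    """
    if not path.startswith(PREFIX + "/"):
        return ("pass", path)
    target = path[len(PREFIX):]
    # Refuse targets like //evil.example, which a browser would
    # treat as a redirect to another host.
    if target.startswith("//"):
        return ("tarpit", None)
    if COOKIE in cookies:
        return ("redirect", target)
    return ("tarpit", None)
```

A visitor with JS never reaches `handle_botcheck` with a prefixed path, since the prefix is stripped client-side; a visitor with only cookies gets one extra redirect per click; a client with neither ends up in the tarpit.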