my mental model of the scraping load on most of the public web is basically:
people want to train models, want them trained on the latest stuff, and there is money for doing this. there is little incentive to be efficient or responsible about it, so we wind up with a bunch of crawlers just absolutely going to town.
do you actually work in this field and know better than me? am i missing something important?