For the most part, the Internet Archive limits its scraping to websites that permit it. The Robots Exclusion Protocol (AKA robots.txt) makes it easy for webmasters to tell different kinds of crawlers whether or not they are welcome. If your site has a robots.txt file that tells the Archive's crawler to buzz off, it'll go elsewhere.
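
To make that concrete, here is a minimal sketch of how a well-behaved crawler might consult a robots.txt file before fetching a page, using Python's standard `urllib.robotparser`. The user-agent token `ia_archiver`, the sample rules, the `example.com` URLs, and the helper name `is_allowed` are all assumptions chosen for illustration; check the Archive's own documentation for the token its crawler currently honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn away one crawler, welcome everyone else.
# "ia_archiver" is an assumed user-agent token, used only for illustration.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some/page"
    print("ia_archiver:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print("SomeOtherBot:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Note that the protocol is purely advisory: the file only states the webmaster's wishes, and it is up to each crawler to read it and comply.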