Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://nya.social/notes/818c3d1bdb3e20788eb08e25">นาตาลี :bellsystem: (natalie@nya.social)'s status on Wednesday, 06-Dec-2023 12:42:01 JST</a><a href="https://nya.social/@natalie" title="natalie@nya.social"><img src="https://gnusocial.jp/avatar/3687-48-20220807192020.webp" width="48" height="48" alt="นาตาลี :bellsystem:" style="position: absolute; left: 0; top: 0;">นาตาลี :bellsystem:</a><div><ul><li></ul></div></section><article><p>there is currently a bot inside MIT IP space, address 18[.]4[.]38[.]176, scanning fedi at large. i have confirmed this with 5+ unrelated instance admins, large and small instances, across mastodon/misskey/pleroma/akkoma.<br><br>the bot is poorly behaved. i have observed it making repeated requests, multiple times per second, for the exact same paths (the paths being, generally: user profiles, specific posts, and sometimes following links in posts). returning 403s does not stop this activity. one of my domains received hundreds of additional requests despite replying with 403 to all of them. i have also seen it make requests for paths containing html tags - seems like a badly written parser. the purpose of these requests and what data is being gathered is unclear.<br><br>PTR on the ip returns sts-drand03.mit.edu. a quick web search for "mit drand" brings back <a href="https://mitsloan.mit.edu/faculty/directory/david-g-rand">https://mitsloan.mit.edu/faculty/directory/david-g-rand</a> and his personal website: <a href="https://davidrand-cooperation.com/">https://davidrand-cooperation.com/</a> (note: other IPs in the /24 also have names in the PTR which match up with names of MIT faculty, but only the .176 IP appears to be involved in this activity).<br>seems he's doing research into "misinformation" and "fake news" on social media. he also appears to be on fedi! 
so <a href="https://techhub.social/@Drand">@Drand@techhub.social</a>, given this activity is sourced from an IP with your name on it, could you share the purpose of this traffic? what data is being collected and how is it being used? do you plan to respect robots.txt or identify yourself in your useragent? is there a process for instance admins to opt out of this activity other than blocking the source IP?</p></article><footer><a rel="bookmark" href="https://gnusocial.jp/conversation/2418733#notice-4780692">In conversation</a><time datetime="2023-12-06T12:42:01+09:00" title="Wednesday, 06-Dec-2023 12:42:01 JST">Wednesday, 06-Dec-2023 12:42:01 JST</time> <span>from <span><a href="https://nya.social/notes/818c3d1bdb3e20788eb08e25" rel="external" title="Sent from nya.social via ActivityPub">nya.social</a></span></span><a href="https://nya.social/notes/818c3d1bdb3e20788eb08e25">permalink</a><h4>Attachments</h4><ol><li><label><a rel="external" href="https://gnusocial.jp/attachment/1926960">Untitled attachment</a></label><br></li><li><article><header><div>Domain not in remote thumbnail source whitelist: mitsloan.mit.edu</div><h5><a href="https://mitsloan.mit.edu/faculty/directory/david-g-rand">David G. Rand | MIT Sloan</a></h5><div></div></header><div></div><footer></footer></article></li><li><article><header><div>No result found on File_thumbnail lookup.</div><h5><a href="https://davidrand-cooperation.com/">David Rand</a></h5><div></div></header><div></div><footer></footer></article></li></ol></footer></blockquote>

Corresponding Notice

นาตาลี :bellsystem: (natalie@nya.social)'s status on Wednesday, 06-Dec-2023 12:42:01 JST
- Dave Rand
there is currently a bot inside MIT IP space, address 18[.]4[.]38[.]176, scanning fedi at large. i have confirmed this with 5+ unrelated instance admins, large and small instances, across mastodon/misskey/pleroma/akkoma.

the bot is poorly behaved. i have observed it making repeated requests, multiple times per second, for the exact same paths (the paths being, generally: user profiles, specific posts, and sometimes following links in posts). returning 403s does not stop this activity. one of my domains received hundreds of additional requests despite replying with 403 to all of them. i have also seen it make requests for paths containing html tags - seems like a badly written parser. the purpose of these requests and what data is being gathered is unclear.

PTR on the ip returns sts-drand03.mit.edu. a quick web search for "mit drand" brings back https://mitsloan.mit.edu/faculty/directory/david-g-rand and his personal website: https://davidrand-cooperation.com/ (note: other IPs in the /24 also have names in the PTR which match up with names of MIT faculty, but only the .176 IP appears to be involved in this activity).
seems he's doing research into "misinformation" and "fake news" on social media. he also appears to be on fedi! so @Drand@techhub.social, given this activity is sourced from an IP with your name on it, could you share the purpose of this traffic? what data is being collected and how is it being used? do you plan to respect robots.txt or identify yourself in your useragent? is there a process for instance admins to opt out of this activity other than blocking the source IP?
In conversation, Wednesday, 06-Dec-2023 12:42:01 JST, from nya.social
Attachments
1. Untitled attachment
2. David G. Rand | MIT Sloan
3. David Rand
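
For admins who want to repeat the PTR check described in the notice, here is a minimal Python sketch. It assumes only the standard library. The helper names `refang`, `slash24`, and `ptr_name` are hypothetical and chosen for illustration; they do not come from the original post. The notice reports sts-drand03.mit.edu as the PTR for the .176 address, but DNS records can change, so a live lookup may return something else or nothing at all.

```python
import ipaddress
import socket
from typing import Callable, Optional


def refang(defanged: str) -> str:
    """Turn a defanged address such as 18[.]4[.]38[.]176 back into dotted form."""
    return defanged.replace("[.]", ".")


def slash24(ip: str) -> ipaddress.IPv4Network:
    """Return the /24 network containing ip (the notice mentions neighbours in the /24)."""
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def ptr_name(ip: str, resolver: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the reverse-DNS (PTR) hostname for ip, or None if the lookup fails.

    The resolver is injectable so the function can be exercised without
    network access; by default it uses socket.gethostbyaddr.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname
```

Calling `ptr_name(refang("18[.]4[.]38[.]176"))` performs a live lookup for the address from the notice, and iterating over `slash24(...).hosts()` with the same function is one way to check the neighbouring addresses. A matching PTR name is only a hint about who operates a host, not proof of it.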
