So, I fell down a rabbit hole. I learned that Mastodon can effectively "DDoS" a server: the moment a post with a link federates, thousands of instances each fetch that URL's preview metadata at roughly the same time.
Interestingly, this isn't just a Mastodon problem; Bluesky has to deal with it too, just in a different form.
The post that started all of this is: https://aumetra.xyz/posts/the-fedi-ddos-problem. Thanks to @aumetra for writing it! Give it a read first; it provides some context.
My understanding (I'm stoopid, so don't quote me):
On Mastodon's end, the issue remains unsolved. Their stance is essentially: trust your own instance to fetch the correct metadata. Yes, this causes a traffic surge, but there's no better solution at the moment, and pushing the fetch down to clients would only make it worse, since there are far more clients than instances.
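To make the surge concrete, here's a minimal sketch of roughly what every receiving instance does today when a post with a link arrives. It's simplified (regexes instead of a real HTML parser, no timeouts, size limits, or caching), so treat it as an illustration, not how Mastodon actually implements it:

```typescript
// Minimal sketch: roughly what each receiving instance does to build a link preview.
// Simplified with regexes; real implementations parse HTML properly and enforce
// timeouts, size limits, and per-instance caching.
async function fetchPreview(url: string) {
  const res = await fetch(url, { headers: { "User-Agent": "ExampleFetcher/1.0" } });
  const html = await res.text();

  const og = (prop: string): string | undefined =>
    html.match(
      new RegExp(`<meta[^>]+property=["']og:${prop}["'][^>]+content=["']([^"']*)["']`, "i")
    )?.[1];

  return { title: og("title"), description: og("description"), image: og("image") };
}

// Multiply this by every instance that receives the post, all within seconds
// of each other, and the origin server sees a thundering herd.
```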
One potential solution could involve relays: offload the fetching to a network of relays that retrieve the metadata once and serve it to everyone. This centralizes things a bit more while keeping options open, since any instance could choose whether to use a relay. It would mitigate the issue, but not completely solve it.
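A very rough sketch of the idea (entirely hypothetical, nothing like this exists in Mastodon today): an instance asks a relay for a URL's preview, and the relay fetches the page once and caches the result for everyone. `fetchPreview` is the sketch from above; names and TTL are made up:

```typescript
// Hypothetical relay-side handler: fetch a URL's preview once, cache it,
// and serve the cached copy to every instance that asks. Not a real API.
const cache = new Map<string, { data: unknown; expires: number }>();
const TTL_MS = 5 * 60 * 1000;

async function relayPreview(url: string) {
  const hit = cache.get(url);
  if (hit && hit.expires > Date.now()) return hit.data; // cache hit: the origin is never touched

  const data = await fetchPreview(url); // one upstream fetch on behalf of the whole network
  cache.set(url, { data, expires: Date.now() + TTL_MS });
  return data;
}
```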
On the AT / Bluesky side, when you create a post (using their lexicon "app.bsky.feed.post"), the author's client supplies the link-card metadata itself, so you can pass whatever you want (https://docs.bsky.app/docs/advanced-guides/posts#website-card-embeds). Their stance is that if someone publishes something misleading, it can be reported and moderated. A rough sketch of such a post follows the links below.
Explanation of the “attack”: https://www.bentasker.co.uk/posts/blog/security/bluesky-posting-enables-misinformation-and-phishing-campaigns.html
Related discussion: https://github.com/bluesky-social/atproto/discussions/1304
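For context, this is roughly the shape of a post record with a website card, per the docs linked above; the text and card values here are made up:

```typescript
// Roughly the shape of an app.bsky.feed.post record with a website card
// (app.bsky.embed.external). The uri/title/description are supplied by the
// author's client; nothing forces them to match the page they point at.
const post = {
  $type: "app.bsky.feed.post",
  text: "Check this out",
  createdAt: new Date().toISOString(),
  embed: {
    $type: "app.bsky.embed.external",
    external: {
      uri: "https://example.com/some-page",
      title: "Whatever the author's client wants it to say", // not verified against the page
      description: "Could be completely unrelated to the real content",
      // thumb: an optional image blob, uploaded separately
    },
  },
};
```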
At the end of the day, you have to trust someone anyway: your Mastodon instance, your Mastodon/Bluesky client, the author of the post, or a centralized endpoint serving the metadata for you.
@renchap wrote about this as well, offering several ideas for solving it: https://gist.github.com/renchap/3ae0df45b7b4534f98a8055d91d52186
If I can offer my humble opinion after gathering some information on the issue (again, I'm stoopid), the best approach is to rely on the website itself. It's the source of truth for its own metadata.
First, having a cache in front of websites (like a CDN or reverse proxy) would largely absorb the surge: the herd hits the cache instead of the application server. It's more of a workaround than a real solution, but come on, it's 2024; a cache in front of your site shouldn't be exotic.
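A minimal sketch of that workaround, assuming a plain Node.js HTTP server sitting behind a CDN or reverse proxy; the key part is the Cache-Control header that lets shared caches soak up the burst:

```typescript
// Minimal sketch of the workaround: the origin marks its pages as cacheable by
// shared caches, so a CDN or reverse proxy in front absorbs the fetch storm
// instead of the application server.
import { createServer } from "node:http";

createServer((_req, res) => {
  // s-maxage is for shared caches (CDN, reverse proxy); a few minutes is plenty
  // to flatten a federation-driven burst.
  res.setHeader("Cache-Control", "public, s-maxage=300, max-age=60");
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.end('<html><head><meta property="og:title" content="Hello"></head><body>…</body></html>');
}).listen(8080);
```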
Second, signing the metadata is a great idea as well, and it would again rely on the website as the source of truth.
However, another problem remains unsolved: what if someone maliciously publishes a post with tampered metadata? It would get federated as is.
So having a signature on the website's end would solve both problems: you check the signature, and if it doesn't match, you fetch the metadata yourself; if it does match, you can republish it without fetching anything, so the website never gets hammered.
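Nothing like this exists today, so the following is purely a sketch of the idea: the website signs its preview metadata, the signed blob travels with the federated post, and a receiving instance only fetches the page itself when verification fails. Every field name, the key-handling, and the choice of Ed25519 via WebCrypto (available in recent runtimes) are invented for illustration:

```typescript
// Hypothetical verification flow on the receiving instance's side: trust the
// signature that travels with the federated metadata, and only fetch the page
// ourselves if it doesn't check out.
interface SignedPreview {
  url: string;
  title: string;
  description: string;
  signature: string;     // base64 signature over the canonicalized fields below
  publicKey: CryptoKey;  // in reality: discovered and pinned per-site, not embedded
}

async function acceptPreview(preview: SignedPreview) {
  const payload = new TextEncoder().encode(
    JSON.stringify({ url: preview.url, title: preview.title, description: preview.description })
  );
  const signature = Uint8Array.from(atob(preview.signature), (c) => c.charCodeAt(0));

  const valid = await crypto.subtle.verify("Ed25519", preview.publicKey, signature, payload);

  // Signature matches: republish as-is, zero requests to the website.
  if (valid) return preview;

  // Missing or invalid signature: don't trust the federated copy, fetch it ourselves.
  return fetchPreview(preview.url); // the sketch from earlier in the post
}
```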
So my opinion is: trust the website first, then use a relay to balance the load, and finally fall back to your own instance fetching, just as it works today.
Give me your thoughts! I fell into a rabbit hole with this post, and I love how complex the issue is to solve.