@schnouki RIGHT?! It's like I was reading a successful geeky blog 15 years ago where the author talked about implementing some cache strategy, and the server was just an old computer @ home. It's not that hard. It's even funnier when they actually deployed a CDN and apparently don't know how to use it. And then they blame one of the most popular alternatives to centralized social media for not solving this (non-)issue ASAP, as if it were an easy thing and a priority for them 😮💨.
Tbh the comments on Mastodon, both here and on GitHub, are really harsh. There are like 3 people working full-time on the core software, yet everyone acts as if those devs owe them everything and are never fast enough to ship anything, yada yada. I would quit after a day.
My understanding (I'm stoopid, so don't quote me):
On Mastodon's end, the issue remains unsolved. Their stance is essentially: trust your instance to fetch the correct metadata. Yes, this approach can cause a traffic surge, but there's no better solution at the moment, and relying on clients would only make it worse. One potential solution could be offloading the fetching to a network of relays. This would centralize things a bit more while keeping options open: any instance could choose to use a relay or not. It would mitigate the issue, but not completely solve it.
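Just to make the relay idea concrete, here's a rough sketch of what an instance-side call could look like. The relay URL, query parameter, and JSON shape are all made up for illustration, nothing like this exists in Mastodon today:

```python
# Sketch of the relay idea (hypothetical API): instead of every instance
# hitting the article itself, an instance asks a shared relay, and only
# the relay ever touches the origin website.
import json
import urllib.parse
import urllib.request

RELAY = "https://relay.example/preview"   # hypothetical relay endpoint

def fetch_preview(article_url: str) -> dict:
    """Ask the relay for the link-preview metadata of article_url."""
    query = urllib.parse.urlencode({"url": article_url})
    with urllib.request.urlopen(f"{RELAY}?{query}") as resp:
        # e.g. {"title": ..., "description": ..., "image": ...}
        return json.load(resp)

# Usage, assuming such a relay existed:
# meta = fetch_preview("https://news.itsfoss.com/mastodon-link-problem/")
```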
At the end of the day, you have to trust someone anyway: your Mastodon instance, your Mastodon/Bluesky client, the author of the post, or a centralized endpoint serving the metadata for you.
If I can offer my humble opinion after gathering some information on the issue (again, I'm stoopid), the best approach is to rely on the website. It's the source of everything.
First, having a cache in front of the website (a CDN or reverse proxy) would prevent the issue entirely. Sure, that's more of a workaround than a real solution, but come on, it's 2024...
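For the sake of illustration, this is roughly all a caching layer has to do: serve repeated requests for the same page from memory so only the first one reaches the origin. A real CDN or reverse proxy handles headers, invalidation, and a lot more; this toy version (made-up ORIGIN and TTL values) just shows why the surge never hits the website:

```python
# Minimal sketch of a caching reverse proxy: repeated GETs for the same
# path are answered from an in-memory cache instead of hitting the origin.
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ORIGIN = "http://localhost:8080"   # the real website (assumption)
TTL = 300                          # cache entries live for 5 minutes
_cache: dict[str, tuple[float, bytes]] = {}

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        now = time.monotonic()
        hit = _cache.get(self.path)
        if hit and now - hit[0] < TTL:
            body = hit[1]                      # cache hit: origin untouched
        else:
            with urllib.request.urlopen(ORIGIN + self.path) as resp:
                body = resp.read()             # one fetch, then reuse it
            _cache[self.path] = (now, body)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8081), CachingProxy).serve_forever()
```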
Second, signing the metadata is a great idea as well, and it would again rely on the website as the source of truth.
However, another problem remains unsolved: what if someone maliciously publishes a post with tampered metadata? It would get federated as-is. So having a signature on the website's end would solve both problems: you check the signature, and if it doesn't match, you fetch the metadata yourself. If it matches, you can simply republish it without further verification, and thus without hammering the website.
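To show what I mean, here's a hedged sketch of that check using Ed25519. The key handling, JSON layout, and the fallback function are all illustrative assumptions, not anything Mastodon actually implements:

```python
# Sketch of "signed metadata": trust federated metadata only if the
# website's signature checks out, otherwise fetch it yourself.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_or_refetch(metadata: dict, signature: bytes,
                      site_public_key: Ed25519PublicKey) -> dict:
    """Return the metadata if the website signed it, else re-fetch it."""
    payload = json.dumps(metadata, sort_keys=True).encode()
    try:
        site_public_key.verify(signature, payload)
        return metadata          # signature matches: republish as-is, no extra fetch
    except InvalidSignature:
        # Tampered or unsigned: fall back to the instance's own fetch.
        return fetch_metadata_directly(metadata["url"])

def fetch_metadata_directly(url: str) -> dict:
    # Placeholder for the instance's normal OpenGraph fetch.
    raise NotImplementedError
```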
So my opinion is: trust the website first, then use a relay to balance the load, and finally rely on your own instance, just as it is today.
Give me your thoughts! I fell into a rabbit hole with this post, and I love how complex the issue is to solve.
But seriously, in 2024, a traffic surge causing downtime on a website?? Coming from a “huge” one like itsfoss.com (https://news.itsfoss.com/mastodon-link-problem/), that's really worrying.