Conversation
feld (feld@bikeshed.party)'s status on Thursday, 29-Aug-2024 03:11:34 JST feld We're having problems with link previews / rich media and there's nothing we can do about it. I suspect Mastodon instances have the same problem but just haven't noticed yet.
Originally we had issues with link previews to social media sites like Twitter, Facebook, etc. The OpenGraph/TwitterCard meta tags were not there unless you had 'Bot' in your User Agent string or in some cases you literally needed 'Twitterbot' in there. This is how Telegram works actually -- their User Agent is 'Telegrambot (like Twitterbot)'.
This doesn't work anymore. I see a ton of 403s from normal websites.
I try with curl, change the user agent, etc. Always a 403. The error message says you need to enable JavaScript and disable your ad blocker. It also returns an HTTP header that indicates they think you're a web scraper.
But how does this work for Telegram etc then? Well, their IP ranges must be on a whitelist at the CDN level.
I've confirmed Fastly and Cloudflare are doing this but I don't know how it works. It's probably an opt-in anti-abuse feature that companies are enabling.
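For reference, the kind of link-preview fetch described above can be sketched roughly as follows. This is a minimal Python sketch, not how any Fediverse server actually implements previews: `PreviewTagParser`, `extract_preview_tags`, `build_preview_request`, and the `ExamplePreviewBot/0.1` User-Agent string are all hypothetical names made up for illustration. It builds a request with a bot-style User-Agent and pulls OpenGraph / Twitter Card meta tags out of whatever HTML comes back.

```python
from html.parser import HTMLParser
from urllib.request import Request


class PreviewTagParser(HTMLParser):
    """Collect OpenGraph (og:*) and Twitter Card (twitter:*) meta tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # OpenGraph normally uses property="og:...",
        # Twitter Cards normally use name="twitter:...".
        key = attr.get("property") or attr.get("name") or ""
        content = attr.get("content")
        if key.startswith(("og:", "twitter:")) and content is not None:
            # Keep the first value seen for each key.
            self.tags.setdefault(key, content)


def extract_preview_tags(html):
    """Return a dict of preview-related meta tags found in an HTML string."""
    parser = PreviewTagParser()
    parser.feed(html)
    return parser.tags


def build_preview_request(url, user_agent="ExamplePreviewBot/0.1 (like Twitterbot)"):
    """Build (but do not send) a request carrying a bot-style User-Agent.

    The default User-Agent is a made-up example modelled on the
    'Telegrambot (like Twitterbot)' pattern mentioned in the post.
    """
    return Request(url, headers={"User-Agent": user_agent})
```

Passing the result of `build_preview_request(...)` to `urllib.request.urlopen` would perform the actual fetch. Per the post above, against an affected site that call would be expected to raise an `HTTPError` with status 403 regardless of the User-Agent chosen, so there would be no HTML for `extract_preview_tags` to parse.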
Haelwenn /элвэн/ :triskell: (lanodan@queer.hacktivis.me)'s status on Thursday, 29-Aug-2024 03:11:33 JST Haelwenn /элвэн/ :triskell: @feld
> It also returns an HTTP header that indicates they think you're a web scraper.
Which in a way is true :D
Haelwenn /элвэн/ :triskell: (lanodan@queer.hacktivis.me)'s status on Thursday, 29-Aug-2024 03:25:02 JST Haelwenn /элвэн/ :triskell: @feld And from a technical perspective, Fediverse link previews in their current design are indistinguishable from a low-bandwidth DDoS botnet whose nodes all hit the same URLs at roughly the same time.
User-Agent strings can't help; they are too trivial to spoof.
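To illustrate the point about timing, here is a small Python sketch under made-up assumptions: each instance that receives a federated post fetches the linked URL once, shortly after delivery. The function name `requests_in_window` and the sample arrival times are hypothetical, not measurements. It counts the largest number of fetches that land within any window of a given length, which is roughly what the target site sees as a burst.

```python
def requests_in_window(arrival_times, window_seconds):
    """Return the largest number of fetches falling within any
    time window of length window_seconds (sliding-window count)."""
    times = sorted(arrival_times)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Advance the window start until it is within range of t.
        while t - times[start] > window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical fetch times (seconds after the post was published),
# one per receiving instance.
fetch_times = [0.2, 0.4, 0.5, 1.1, 1.3, 9.0]
burst = requests_in_window(fetch_times, 1.0)
```

With these invented numbers, four of the six fetches fall inside a single one-second window, each from a different server and IP address.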
(mint@ryona.agency)'s status on Friday, 30-Aug-2024 00:08:09 JST @feld I remember Gleason changed the User-Agent to WhatsApp's for link preview fetching, which apparently worked for some sites that were blocking other User-Agents.
feld (feld@bikeshed.party)'s status on Friday, 30-Aug-2024 00:15:31 JST feld @mint I tested that it works with Telegram, then used Telegram's user agent -- no dice.
Tried from residential connections and a few different datacenters, IPv4 and IPv6. They all get the 403.
feld (feld@bikeshed.party)'s status on Friday, 30-Aug-2024 00:15:38 JST feld @mint yeah, as far as I can tell the user agent tricks don't work anymore. They're doing fingerprinting on the raw traffic plus IP address / AS lookups and blocking you with a 403 if they're pretty sure it's not a real browser.
pomstan (pomstan@xn--p1abe3d.xn--80asehdb)'s status on Friday, 30-Aug-2024 00:52:46 JST pomstan
Exterminatus (ex@utih.net)'s status on Friday, 30-Aug-2024 04:23:05 JST Exterminatus @pomstan More like copro net.
@feld @mint