@ricferrer@julian@rimu@piefed.social@evan Instead of putting web+ap links everywhere, PieFed just silently rewrites URLs in post bodies so they go to the local copy of each post, if we have it.
In the threadiverse every instance has every post so this works pretty well.
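A minimal sketch of that rewriting step, assuming a simple lookup table (in PieFed the lookup would be against the instance's database of known posts; the helper name and mapping here are hypothetical):

```python
import re

def rewrite_fedi_links(body_html: str, local_copy_of: dict) -> str:
    """Replace any URL we hold a local copy of with the local URL.

    local_copy_of maps remote post URLs to local paths; in practice this
    lookup would query the instance's database of federated posts.
    """
    def replace(match):
        url = match.group(0)
        # Only rewrite when a local copy exists; otherwise leave the link alone.
        return local_copy_of.get(url, url)

    return re.sub(r'https?://[^\s"<>]+', replace, body_html)

known = {"https://lemmy.world/post/12345": "/post/678"}
print(rewrite_fedi_links('<a href="https://lemmy.world/post/12345">story</a>', known))
# -> <a href="/post/678">story</a>
```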
@tchambers Mastodon just backfills on demand (when someone views a post) so their solution just papers over the issue.
When receiving a post from outside, Mastodon already does a GET for each link in the post in order to construct the preview card (using Open Graph metadata). At that point it could detect that a link is a fedi link, backfill it to construct a local copy of the post (if necessary) and then rewrite the original link to point at that new copy.
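One way that detection could work during the preview-card fetch is content negotiation: request the ActivityPub representation of the URL and check whether the server answers with an ActivityStreams media type. A sketch of just the check (how Mastodon would wire this in is an assumption here, not its actual code):

```python
AP_CONTENT_TYPES = {"application/activity+json", "application/ld+json"}

def looks_like_fedi_object(content_type: str) -> bool:
    """True if a Content-Type header indicates an ActivityPub object.

    The fetcher would send
    Accept: application/ld+json; profile="https://www.w3.org/ns/activitystreams"
    and call this on the response's Content-Type header; an ordinary
    web page replies with text/html instead.
    """
    # Strip parameters such as '; profile="..."' before comparing.
    return content_type.split(";")[0].strip().lower() in AP_CONTENT_TYPES

print(looks_like_fedi_object('application/ld+json; profile="https://www.w3.org/ns/activitystreams"'))  # True
print(looks_like_fedi_object("text/html; charset=utf-8"))  # False
```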
You can test it on https://crust.piefed.social as that is the bleeding-edge instance - it's not formally released yet.
On the threadiverse nearly everything is federated to nearly everyone so nearly all instances have most content. This makes it a simple matter of detecting links that go to fediverse instances and rewriting them to go to the local copy.
It wouldn't work so well with links to Mastodon content because chances are we wouldn't have it locally.
In the past I've coded lots of features for blocking things, which is all about excluding what we don't want.
I'm finding it much harder to think of ways to attract and promote things we DO want.
In an ideal utopia we'd be making social media that makes people feel better. Not neutral or the same as before and certainly not worse than before. I don't think we can get there by excluding bad things, we need to be including/nurturing/promoting good things too.
Now you can tell at a glance that the link goes to a post in a community called 'not the onion' and is something about an official selling their house.
Hard to imagine how this could work while still allowing usage by regular people AND avoiding scrapers that pretend to be regular people. But let's see.
This is just robots.txt on steroids in the sense that it's entirely opt-in and only binds law-abiding actors. It has no answer to the badly-behaved scrapers that ignore robots.txt and overwhelm our instances.
Having said that, it will still be great to have a way to bill the 'good' crawlers, and I appreciate the lightweight and simple methods they propose. It might work.
Or it could just mean the incentive for crawlers to spoof user-agents is higher...
The master list of PieFed instances is retrieved from the fediverse.observer API. Then an API endpoint on every instance in the list is queried to see if they opted in and to get information about the instance that fediverse.observer does not have.
This happens once per day.
Instance admins can choose to filter out any other instance and defederated ones are automatically filtered.
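That daily refresh could be sketched like this (the `query` callable and the `opted_in` field are hypothetical stand-ins for the real per-instance API endpoint):

```python
def refresh_instance_list(observer_domains, blocked, query):
    """Build the instance list for the chooser. Runs once per day.

    observer_domains: master list of PieFed domains from fediverse.observer.
    blocked: domains filtered by the admin, plus defederated instances.
    query: callable that hits an instance's opt-in API endpoint and returns
           a dict of instance info, or None if unreachable.
    """
    instances = []
    for domain in observer_domains:
        if domain in blocked:  # admin-filtered and defederated instances are skipped
            continue
        info = query(domain)
        if info and info.get("opted_in"):  # keep only instances that opted in
            instances.append(info)
    return instances

# Usage with a fake query function standing in for the network call:
fake_query = lambda d: {"domain": d, "opted_in": d != "quiet.example"}
print(refresh_instance_list(
    ["a.example", "bad.example", "quiet.example"],
    {"bad.example"},
    fake_query,
))
# -> [{'domain': 'a.example', 'opted_in': True}]
```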
I'm hoping to maximize the dispersal of new users across all the instances, so I wanted to keep other options visible from the beginning.
Can't wait to roll this out more widely. A key part of it is that it's built into every #PieFed instance, not a separate web site. So its power comes from everyone being on board.