Hey, all. Thanks very much to everyone who responded to the poll, and thanks to everyone who had no idea what the question meant and waited patiently for an explanation. Here it comes!
So, a brief tutorial on how ActivityPub activities are delivered: every actor on the network has a dedicated API endpoint called their `inbox`. When another actor wants to send them something, the content is posted to the actor's `inbox` using HTTP.
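To make that concrete, here's a minimal sketch in Python of what "posted to the actor's `inbox`" can look like. The actor, the URLs, and the `build_delivery` helper are all made up for illustration. The request is only built, not sent, and authentication is left out entirely (in practice most Fediverse servers expect signed requests).

```python
import json
import urllib.request

# Media type the ActivityPub spec uses for activities.
AS_CONTENT_TYPE = 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'


def build_delivery(inbox_url: str, activity: dict) -> urllib.request.Request:
    """Build (but do not send) the HTTP POST that delivers an activity to an inbox.

    Hypothetical helper: a real server would also sign the request.
    """
    body = json.dumps(activity).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        headers={"Content-Type": AS_CONTENT_TYPE},
        method="POST",
    )


# Hypothetical recipient actor and activity, for illustration only.
recipient = {
    "id": "https://social.example/users/alice",
    "inbox": "https://social.example/users/alice/inbox",
}
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://blog.example/users/bob",
    "to": [recipient["id"]],
    "object": {"type": "Note", "content": "Hello, Alice!"},
}

request = build_delivery(recipient["inbox"], activity)
```

One request like this per recipient is the baseline that the optimizations discussed in this thread try to improve on.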
Delivering to hundreds or thousands of different actors can take a lot of time and resources from the sending server.
One optimization applies when a lot of actors are on the same receiving server. If that happens, the sending server can send the activity *once* to the receiving server, and the receiving server can route the activity to the inbox of each actor internally.
So, if an activity is supposed to go to M actors, it would normally take about M deliveries to get it to all those actors. But if there are on average N actors that share a server, then using a shared inbox will reduce the total number of deliveries to M/N.
In ActivityPub, each actor has an optional `sharedInbox` endpoint that is used for this shared inbox delivery. If an activity is supposed to be delivered to 10 actors on the same receiving server, it only gets delivered once, and the receiving server routes it to those 10 actors directly. We expect that internal routing to be much faster!
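Here's a rough Python sketch of that collapsing step, assuming recipient actors have already been fetched as dictionaries. The `delivery_targets` function and the example servers are hypothetical; the `endpoints.sharedInbox` property is where the ActivityPub spec puts the shared inbox on an actor.

```python
def delivery_targets(recipients: list[dict]) -> set[str]:
    """Return the distinct URLs to POST to.

    Prefers an actor's sharedInbox when it advertises one, and falls back
    to the actor's personal inbox otherwise.
    """
    targets = set()
    for actor in recipients:
        shared = (actor.get("endpoints") or {}).get("sharedInbox")
        targets.add(shared or actor["inbox"])
    return targets


# Hypothetical data: 10 actors on one server that has a shared inbox...
recipients = [
    {
        "id": f"https://big.example/users/user{i}",
        "inbox": f"https://big.example/users/user{i}/inbox",
        "endpoints": {"sharedInbox": "https://big.example/inbox"},
    }
    for i in range(10)
]
# ...plus one actor on a server that does not advertise one.
recipients.append(
    {
        "id": "https://tiny.example/users/solo",
        "inbox": "https://tiny.example/users/solo/inbox",
    }
)

targets = delivery_targets(recipients)
print(len(recipients), "recipients,", len(targets), "deliveries")
```

With this made-up data, 11 recipients collapse into 2 deliveries: one to the shared inbox on `big.example` and one to the personal inbox on `tiny.example`.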
Another issue with ActivityPub is that we have a special delivery address, `Public`, which means that the activity should be visible to _everyone_. One question that arises is, if it's visible to everyone, should it be *delivered* to "everyone"? And if so, how is "everyone" defined?
@evan I've been impressed seeing how fast edits to posts are propagated, like when I fix an error or even change an image from my WordPress ActivityPub host, the change seems fast.
So each load of my timeline is *not* making repeated external requests?
One answer to that is that the sending server should send the activity to every actor on the ActivityPub network. That's clearly too much -- literally millions of actors. There's also not a big list you can look at that shows every actor on the network.
Another option is that the sending server should send the activity to every *server* on the network, instead. So, if there are on average N actors per server on the network, it's (all users)/N, which might be tractable, although it's still pretty big on the modern Fediverse (~30,000 servers, according to fedidb.org). But, again, there's not a big list of all the available servers that the sending server can use for publishing to.
For ActivityPub, we didn't say that public activities should be sent to every server. Instead, we said that they should be sent to every server that the sending server knew about -- usually because an actor on the sending server had interacted with an actor on the receiving server.
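One way to picture "every server that the sending server knew about" is a small registry that gets updated whenever the server sees a remote actor. This is only a sketch: the `KnownServers` class and the example actors are hypothetical, and a real server would persist this somewhere rather than keep it in memory.

```python
class KnownServers:
    """Remembers the sharedInbox endpoints of remote actors we have seen."""

    def __init__(self) -> None:
        self._shared_inboxes: set[str] = set()

    def observe(self, actor: dict) -> None:
        """Record an actor's sharedInbox, if it advertises one."""
        shared = (actor.get("endpoints") or {}).get("sharedInbox")
        if shared:
            self._shared_inboxes.add(shared)

    def shared_inboxes(self) -> set[str]:
        """Return a copy of every sharedInbox URL seen so far."""
        return set(self._shared_inboxes)


known = KnownServers()

# Hypothetical actors encountered through follows, replies, likes, and so on.
known.observe({"id": "https://a.example/users/ann", "endpoints": {"sharedInbox": "https://a.example/inbox"}})
known.observe({"id": "https://a.example/users/abe", "endpoints": {"sharedInbox": "https://a.example/inbox"}})
known.observe({"id": "https://b.example/users/bea", "endpoints": {"sharedInbox": "https://b.example/inbox"}})
known.observe({"id": "https://c.example/users/cal"})  # no sharedInbox advertised

print(sorted(known.shared_inboxes()))
```

The list only grows through interaction, which is why each server ends up with its own partial view of the network rather than a complete directory.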
I'd like to point out here that these two problems -- optimizing delivery by server rather than by actor, and widely propagating public activities -- are related but not identical.
In the development of ActivityPub, we originally had an endpoint called `publicInbox` for the second problem -- propagating public activities. But when we started working on the optimization issue, we defined a `sharedInbox` also. Here's the key point: we actually combined both features in one endpoint.
So, in the ActivityPub spec, talking about the public propagation of activities, we used this language:
> Additionally, if an object is addressed to the Public special collection, a server MAY deliver that object to all known sharedInbox endpoints on the network.
That "MAY" is important. It says that the sending server *can* do it, but it doesn't *have to*. This leaves it open to interpretation by implementers.
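In code terms, that "MAY" ends up as a choice the implementer makes, something like the flag in this Python sketch. The function names and the `propagate_public` flag are hypothetical. The check for the Public collection is simplified: it only looks for the full IRI, not compacted forms such as `as:Public`.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
ADDRESSING_FIELDS = ("to", "cc", "bto", "bcc", "audience")


def is_public(activity: dict) -> bool:
    """True if any addressing field contains the Public special collection."""
    for field in ADDRESSING_FIELDS:
        value = activity.get(field) or []
        values = [value] if isinstance(value, str) else value
        if PUBLIC in values:
            return True
    return False


def targets_for(
    activity: dict,
    addressed_inboxes: set[str],
    known_shared_inboxes: set[str],
    propagate_public: bool,
) -> set[str]:
    """Inboxes to deliver to, optionally adding every known sharedInbox."""
    targets = set(addressed_inboxes)
    if propagate_public and is_public(activity):
        targets |= known_shared_inboxes
    return targets


# Hypothetical example: one follower's server, three known servers.
note = {"type": "Create", "to": [PUBLIC], "cc": ["https://blog.example/users/bob/followers"]}
addressed = {"https://a.example/inbox"}
known_inboxes = {"https://a.example/inbox", "https://b.example/inbox", "https://c.example/inbox"}

without_propagation = targets_for(note, addressed, known_inboxes, propagate_public=False)
with_propagation = targets_for(note, addressed, known_inboxes, propagate_public=True)
print(len(without_propagation), len(with_propagation))
```

With the flag off, the public note goes to 1 inbox; with it on, it goes to all 3 known shared inboxes. Both behaviors are allowed by the spec language quoted above.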
@evan Having heard of ActivityPub relay servers, it seems to me that relay server owners and regular instance owners can pick and choose what to broadcast (including filtering algorithmically), which relay servers to broadcast to, and which relay servers to receive broadcasts from.
So in other words, don't have regular instances broadcast to regular instances, but rather use relays for that purpose, which then allows distributed choosing.
So, that gets down to the question in this poll: "Should ActivityPub servers send public activities to all known `sharedInbox` endpoints on the network?" Implementers actually have a choice.
There are upsides and downsides to this kind of public propagation. The upside is that public activities are spread around to many more servers. That opens up conversations to more people, makes things more lively and interesting on the Fediverse. It makes the Fediverse feel more like one big place.
The downsides are resource usage. For the sending server, it's got to deliver public activities to every server it's interacted with. That could be thousands or tens of thousands of servers! If that's happening for every public activity, it's a real drag on resources.
The other downside is for the receiving servers. There are millions of actors on the Fediverse, publishing all kinds of activities -- new content, likes, shares, and so on. The receiving servers would be getting that full firehose to their `sharedInbox` endpoint -- the vast majority of which was not interesting to any users on that server.
Importantly, the receiving servers can't get the optimization of using a `sharedInbox` for reducing the number of requests they get without getting this public propagation feature, too. For servers with relatively few actors, the resources they save by having a shared inbox for actors might be vastly outweighed by the resources used to receive public activities!
There's also no way to save resources by only accepting shared delivery and ignoring public propagation. By the time the receiving server can tell the difference between a shared delivery and a public delivery, most of the resources have already been used.
So, my answer to the question is "No, But". Not right now -- it pushes too many resource requirements onto servers, with very little payoff for those servers.
But, I think having public propagation is useful. I think we should separate out the `sharedInbox` functionality into two new endpoints, `sharedInboxOnly` and `publicInbox` (or something like that). Servers that have the capacity and the need can implement the public propagation feature; servers that don't can skip it.
@virtuous_sloth Meh. Relays are a Mastodon-only feature -- they're not part of the ActivityPub standard. I think the topology of an opt-in pool of servers that all share their public activities is fine, though.
@cogdog Maybe what you're asking is, when someone posts a new Note, and you read it in your timeline, does your client (or server) go fetch it from the server it originated on?
The answer is usually no. When your server receives information about new or updated content from other servers, it usually keeps a cache of that content locally. So, when your client asks the server for your timeline, it just gets the data out of the cache, and doesn't fetch it from the remote server.
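As a rough illustration of that caching behavior, here's a small Python sketch. The `ObjectCache` class is hypothetical and much simpler than what any real server does (no expiry, no persistence, no authorization checks), but it shows why an edit shows up quickly without the timeline making remote requests.

```python
class ObjectCache:
    """Keeps a local copy of objects delivered by remote servers."""

    def __init__(self) -> None:
        self._objects: dict[str, dict] = {}

    def receive(self, activity: dict) -> None:
        """Store or overwrite the object carried by a Create or Update activity."""
        obj = activity.get("object")
        if activity.get("type") in ("Create", "Update") and isinstance(obj, dict) and "id" in obj:
            self._objects[obj["id"]] = obj

    def timeline(self, object_ids: list[str]) -> list[dict]:
        """Build a timeline purely from cached copies; no remote fetches."""
        return [self._objects[oid] for oid in object_ids if oid in self._objects]


cache = ObjectCache()
note_id = "https://blog.example/notes/1"  # hypothetical object ID

# The remote server pushes the new post, then later pushes an edit.
cache.receive({"type": "Create", "object": {"id": note_id, "type": "Note", "content": "Helo world"}})
cache.receive({"type": "Update", "object": {"id": note_id, "type": "Note", "content": "Hello world"}})

print(cache.timeline([note_id])[0]["content"])
```

The edited text is what the timeline returns, because the `Update` was pushed to the receiving server and replaced the cached copy.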