Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://infosec.town/notes/a1r68yw8ukl0hj7i">Blake Leonard (blake@infosec.town)'s status on Saturday, 14-Dec-2024 17:36:45 JST</a><a href="https://infosec.town/@blake" title="blake@infosec.town"><img src="https://gnusocial.jp/avatar/180927-48-20231002085246.webp" width="48" height="48" alt="Blake Leonard" style="position: absolute; left: 0; top: 0;">Blake Leonard</a><div><ul><li></ul></div></section><article><p>I'm reading <a href="https://social.coop/@cwebber">@cwebber</a>'s <a href="https://dustycloud.org/blog/re-re-bluesky-decentralization/">Re: Re: How decentralized is Bluesky, really?</a>, and something she said leapt out to me here: If the only options in town are Burger King and McDonalds for food, one may have a degree of options and choice, but this really isn't satisfying to assuage my concerns, even if Taco Bell comes into town.This comparison here strikingly reflects, on the nose!, where the very real and valid concerns about Bluesky come from. ATProto in general does not fit this analogy, but the ATProto microblogging space is the exclusive domain of Bluesky. It's incredibly expensive to run, with all of the practically-centralized parts that the microblogging profile requires. Tracking replies, for example, is quite difficult to do in ATProto, and they solve this by having the Bluesky AppView (the hosted server) remember every post it comes across. That's how it knows about things like the "real" creation date of posts.<br><br>Yes, you can host your own PDS (which is the "home" for your data). You cannot run your own relay or Bluesky AppView, so to effectively microblog on ATProto you NEED Bluesky. This need not be true, but at present it is. 
It will take, at least, a new AppView project with a rethought architecture to create a resource-efficient alternative.<br><br>(Northwood, a new toy project of mine which is an ATProto-based forum software, is planned to approach this differently: each Northwood instance hosts some tracking metadata for each post, such as what category it's in, when it was "added" to the forum, what the last known state was, etc. It'll cache posts for a short time, to make things smoother, but it doesn't <i>store</i> them, and it never needs to pull the full repository.)What's missing from Mark's piece altogether is "Enable Participation". Yes, email has re-centralized. But we should be upset and alarmed that it is incredibly difficult to self-host email these days. This is a real problem. It's not unjustified in the least to be upset about it. And work to try to mitigate it is worthwhile.🎯🎯🎯<br><br>---<br><br>I'm down to the part where she started using math and I feel completely lost, but I think there's a fundamental misunderstanding of how these protocols work and are used in practice -- either mine or hers. My understanding is: for the distribution of posts on Fedi, distributing 1 post to followers on 30 different instances means that the instance has to send the post <i>at least</i> 30 times (or just hope that any of the instances you DO send it to, are able and willing to re-forward to the other ones). The receiving instance then has to save it to their database and distribute notifications to users. On ATProto, making 1 post means you have about 2 round-trip network requests: one from the client to the PDS, and one from a relay to the PDS. Let's pretend there are 30 Bluesky AppView instances that each of your followers are on. Each instance <i>receives</i> the post from whatever relay(s) they're connected to, and saves it to their database. 
Nothing is outgoing from the AppViews, but EVERYTHING is outgoing from the relay.<br><br>This is pretty similar to Nostr's model, and the only bit of it that I actually liked (okay the addressing thing was nice too, but ATProto's and Bluesky's approach to domain-based handles beat that out), except there is only one relay and for some reason relays are also expected to archive/copy full repositories (account data)? Although they have realized that's unsustainable and I think they've turned off their archival bits, last I've heard.<br><br>I think a lot of the problems here stem from Bluesky, not ATProto, being poorly designed. There may yet be more efficient alternative approaches -- maybe my understanding is wrong but still a better approach?<br><br>Ideas:<br>* Relays should mirror/fanout other relays.<br>* Investigate everything that requires a full copy of a repository. This should not be needed for anything, not even storage. <a href="https://misintelligence.xyz/want-to-love-xmpp/#federation">Matrix (disclaimer: my article)</a> and <a href="https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/">Git (MS blog)</a> are fantastic examples of why this type of storage is a bad idea.<br>* AppViews need not be ready to handle every user. They should handle the posts, notifications, and followed posts of the people who actually use it -- Mastodon does this, where it stops building feeds for people who haven't logged on in a month.<br>* For small-world use cases, an architecture where an AppView directly crawls the repos of the follows of its users. 
The only thing this doesn't account for is notifications ("no missed messages"), I think...<br><br>this is giving me a headache</p></article><footer><a rel="bookmark" href="https://gnusocial.jp/conversation/4183960#notice-8176855">In conversation</a><time datetime="2024-12-14T17:36:45+09:00" title="Saturday, 14-Dec-2024 17:36:45 JST">about 4 months ago</time> <span>from <span><a href="https://infosec.town/notes/a1r68yw8ukl0hj7i" rel="external" title="Sent from infosec.town via ActivityPub">infosec.town</a></span></span><a href="https://infosec.town/notes/a1r68yw8ukl0hj7i">permalink</a></footer></blockquote>

Corresponding Notice

Blake Leonard (blake@infosec.town)'s status on Saturday, 14-Dec-2024 17:36:45 JST
- Cassandra Lemmer-Webber
I'm reading @cwebber's "Re: Re: How decentralized is Bluesky, really?", and something she said leapt out at me:

> If the only options in town are Burger King and McDonalds for food, one may have a degree of options and choice, but this really isn't satisfying to assuage my concerns, even if Taco Bell comes into town.

This comparison captures, right on the nose, where the very real and valid concerns about Bluesky come from. ATProto in general does not fit this analogy, but the ATProto microblogging space is the exclusive domain of Bluesky. Microblogging on ATProto is incredibly expensive to run, with all of the practically centralized parts that the microblogging profile requires. Tracking replies, for example, is quite difficult to do in ATProto, and Bluesky solves this by having the Bluesky AppView (the hosted server) remember every post it comes across. That's how it knows about things like the "real" creation date of posts.

Yes, you can host your own PDS (which is the "home" for your data). You cannot run your own relay or Bluesky AppView, so to effectively microblog on ATProto you NEED Bluesky. This need not be true, but at present it is. It will take at least a new AppView project with a rethought architecture to create a resource-efficient alternative.

(Northwood, a new toy project of mine for ATProto-based forum software, is planned to approach this differently: each Northwood instance hosts some tracking metadata for each post, such as what category it's in, when it was "added" to the forum, what the last known state was, etc. It'll cache posts for a short time to make things smoother, but it doesn't store them, and it never needs to pull the full repository.)

> What's missing from Mark's piece altogether is "Enable Participation". Yes, email has re-centralized. But we should be upset and alarmed that it is incredibly difficult to self-host email these days. This is a real problem. It's not unjustified in the least to be upset about it. And work to try to mitigate it is worthwhile.

🎯🎯🎯
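
To make the Northwood idea from the parenthetical above concrete, here is a minimal Python sketch of "keep only tracking metadata, cache post bodies briefly". It is not Northwood's actual code: `PostMeta`, `ForumIndex`, the `fetch` callback, and the five-minute TTL are all hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class PostMeta:
    """Tracking metadata only; the post body itself is never stored here."""
    uri: str         # identifier of the post record (hypothetical format)
    category: str    # which forum category the post is in
    added_at: float  # when the post was "added" to the forum
    last_state: str  # last known state, e.g. "live" or "deleted"


class ForumIndex:
    """Hypothetical index: permanent metadata plus a short-lived body cache."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch      # assumed to fetch one record from its home PDS
        self._ttl = ttl_seconds  # assumed cache lifetime
        self._clock = clock
        self.meta: dict[str, PostMeta] = {}
        self._cache: dict[str, tuple[float, str]] = {}

    def track(self, uri: str, category: str) -> None:
        """Record metadata for a post without pulling its repository."""
        self.meta[uri] = PostMeta(uri, category, self._clock(), "live")

    def body(self, uri: str) -> str:
        """Return the post body, refetching once the cached copy expires."""
        now = self._clock()
        hit = self._cache.get(uri)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        text = self._fetch(uri)
        self._cache[uri] = (now, text)
        return text
```

The point of the split is that the permanent state grows with the number of tracked posts, not with the size of every author's full repository.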

---

I'm down to the part where she started using math and I feel completely lost, but I think there's a fundamental misunderstanding of how these protocols work and are used in practice -- either mine or hers. My understanding is this: on Fedi, distributing 1 post to followers on 30 different instances means that the sending instance has to send the post at least 30 times (or just hope that some of the instances it DOES send to are able and willing to re-forward it to the others). Each receiving instance then has to save it to its database and distribute notifications to its users. On ATProto, making 1 post means you have about 2 round-trip network requests: one from the client to the PDS, and one from a relay to the PDS. Let's pretend your followers are spread across 30 Bluesky AppView instances. Each instance receives the post from whatever relay(s) it's connected to, and saves it to its database. Nothing is outgoing from the AppViews, but EVERYTHING is outgoing from the relay.
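
The counting argument above can be written down as a tiny Python model. This only encodes the mental model stated in the post (one delivery per follower instance on Fedi; two PDS requests plus relay fan-out on ATProto); it is not a measurement of either network, and the function names are made up for illustration.

```python
def fedi_outgoing_deliveries(follower_instances: int) -> int:
    """Minimum sends from the author's instance: one per follower instance."""
    return follower_instances


def atproto_message_counts(appviews: int) -> dict[str, int]:
    """Per-post traffic under the post's model of ATProto."""
    return {
        "pds_requests": 2,           # client -> PDS, and relay -> PDS
        "relay_outgoing": appviews,  # the relay feeds every connected AppView
        "appview_outgoing": 0,       # AppViews only receive and store
    }


if __name__ == "__main__":
    print(fedi_outgoing_deliveries(30))  # 30
    print(atproto_message_counts(30))
```

Under this model the total number of deliveries is the same in both cases; what changes is who pays for them: the author's instance on Fedi, the relay on ATProto.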

This is pretty similar to Nostr's model, which is the only bit of Nostr that I actually liked (okay, the addressing thing was nice too, but ATProto's and Bluesky's approach to domain-based handles beats that out), except there is only one relay and, for some reason, relays are also expected to archive/copy full repositories (account data)? Although, last I heard, they have realized that's unsustainable and I think they've turned off the archival bits.

I think a lot of the problems here stem from Bluesky, not ATProto, being poorly designed. There may yet be more efficient alternative approaches -- maybe my understanding is wrong, but what I've described might still be a better approach?

Ideas:
* Relays should mirror/fanout other relays.
* Investigate everything that requires a full copy of a repository. This should not be needed for anything, not even storage. Matrix (disclaimer: my article) and Git (MS blog) are fantastic examples of why this type of storage is a bad idea.
* AppViews need not be ready to handle every user. They should handle the posts, notifications, and followed posts of the people who actually use them -- Mastodon does this, where it stops building feeds for people who haven't logged on in a month.
* For small-world use cases, consider an architecture where an AppView directly crawls the repos of the accounts its users follow. The only thing this doesn't account for is notifications ("no missed messages"), I think...
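
The last two ideas in the list can be sketched together in a few lines of Python: build feeds only for recently active users, and crawl only the repos those users follow. Everything here is hypothetical (the function names, the `did:example:` identifiers, and the exact 30-day cutoff standing in for "a month"); it shows the shape of the idea, not any existing AppView.

```python
from datetime import datetime, timedelta

# Assumed cutoff standing in for "haven't logged on in a month".
INACTIVE_AFTER = timedelta(days=30)


def active_users(last_seen: dict[str, datetime], now: datetime) -> set[str]:
    """Users who logged in recently enough to deserve a built feed."""
    return {user for user, seen in last_seen.items()
            if now - seen <= INACTIVE_AFTER}


def repos_to_crawl(active: set[str], follows: dict[str, set[str]]) -> set[str]:
    """Small-world crawl set: the union of what active users follow."""
    repos: set[str] = set()
    for user in active:
        repos |= follows.get(user, set())
    return repos
```

For example, with one user last seen four days ago and another last seen in October, only the first user's follows end up in the crawl set. As the bullet notes, this does nothing for notifications coming from accounts nobody on the instance follows.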

this is giving me a headache
In conversation about 4 months ago from infosec.town
