> However, I disagree with some of the analysis, and have a couple specific points to correct.
Well this wouldn't be a 20 page response to a response if @bnewbold and I agreed with everything off the bat now would it
One thing @bnewbold did agree on is that "shared heap" and "message passing" are useful distinctions.
In fact I've seen members of the Bluesky team use "shared heap" a few times since to explain their tech, and many people replied saying this distinction was illuminating. I'm really glad!
Now the reality is that "message passing" is hardly a new term
As far as I know though, I did introduce the term "shared heap"
I didn't know what else to call it! Wait actually no that's not actually fully true
If I was going to refer to the CS literature I would probably say that "ActivityPub uses an actor model style approach whereas ATProto uses a global, public, shared tuplespace"
But I wanted the mail metaphor to work and I was pretty sure everyone's eyes would glaze over at "tuplespace"
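For anyone whose eyes did not glaze over, here is a toy sketch of the distinction in Python. It is only an illustration: the class names (`MessagePassingNetwork`, `SharedHeap`) and the users are made up for this example, and neither class reflects how ActivityPub or ATProto is actually implemented. The point is just where the work happens: in message passing the sender addresses specific inboxes, while in a shared heap everything lands in one public pile and readers filter it.

```python
from collections import defaultdict


class MessagePassingNetwork:
    """Toy actor-style model: a sender delivers only to the inboxes it addresses."""

    def __init__(self):
        self.inboxes = defaultdict(list)
        self.deliveries = 0

    def send(self, sender, recipients, body):
        # One delivery per addressed recipient, nobody else sees the message.
        for recipient in recipients:
            self.inboxes[recipient].append((sender, body))
            self.deliveries += 1


class SharedHeap:
    """Toy tuplespace-style model: everyone writes to one public pile."""

    def __init__(self):
        self.records = []

    def publish(self, author, body):
        self.records.append((author, body))

    def read(self, follows):
        # A reader has to look through the whole pile and keep what it wants.
        return [(a, b) for (a, b) in self.records if a in follows]


# Hypothetical users and messages, purely for illustration.
mp = MessagePassingNetwork()
mp.send("alice", ["bob"], "hi bob")
mp.send("carol", ["dave"], "hi dave")

heap = SharedHeap()
heap.publish("alice", "hi bob")
heap.publish("carol", "hi dave")

bob_inbox = mp.inboxes["bob"]      # only what was addressed to bob
bob_view = heap.read({"alice"})    # filtered out of everything published
heap_scanned = len(heap.records)   # records bob's reader had to consider
```

Bob ends up seeing the same message either way, but in the heap model his reader had to consider every record anyone published, which is the part that matters once the pile gets big.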
Anyway, language is this messy squishy thing but part of the success of the previous post I think was terms that allowed us to discuss differences clearly.
(And it is EXACTLY for that same reason that I am gonna dive into analyzing terminology deeper in just a little bit.)
Moving on...
Let's talk about the acknowledgement of scaling expectations that, starting with this terminology, happened in @bnewbold's response, because I think it's very helpful!
(BTW whenever I quote something with ">" in this thread, if I don't otherwise specify, I'm quoting @bnewbold)
> Other data transfer mechanisms, such as batched backfill, or routed delivery of events (closer to "message passing") are possible and likely to emerge. But the "huge public heap" concept is pretty baked-in.
Okay this is helpful. This sets expectations. This is good to acknowledge.
> Given our focus on big-world public spaces, which have strong network effects, our approach is to provide a "zero compromises" user experience. We want Bluesky (the app) to have all the performance, affordances, and consistency of using a centralized platform.
This is also good to acknowledge.
> So, yes, the atproto network today involves some large infrastructure components, including relays and AppViews, and these might continue to grow over time. Our design goal is not to run the entire network on small instances.
Okay yes, yes this is good to ack
> It isn't peer-to-peer, and isn't designed to run entirely on phones or Raspberry Pis. It is designed to ensure "credible exit", adversarial interop, and other properties, for each component of the overall system.
Good okay thank you
> Operating some of these components might require collective (not individual) resources.
Hm okay, this is also good. Okay remember this sentence. This sentence is gonna be really important in just a minute.
But before we get there oh hey, when I wrote my last blogpost I said "whoa in just 4 months storage expectations jumped from 1TB to 5TB. I bet in a month it'll be double, at least 10TB."
Whoops I underestimated, @bnewbold says in his post it's now at least 16TB. Growin' fast!
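To put rough numbers on "growin' fast", here is a back-of-the-envelope calculation in Python using only the figures quoted above. The steady month-over-month compounding is my assumption for illustration, not something either post claims, and the variable names are made up.

```python
# Numbers quoted in the thread: 1TB -> 5TB over 4 months, later "at least 16TB".
start_tb, later_tb, months = 1.0, 5.0, 4
latest_tb = 16.0

# Assumption: steady compounding between the two data points.
monthly_factor = (later_tb / start_tb) ** (1 / months)

# How far past the "at least 10TB" guess the newer figure landed.
guess_tb = 10.0
overshoot = latest_tb / guess_tb

print(f"implied growth: about {monthly_factor:.2f}x per month")
print(f"16TB is {overshoot:.1f}x the 10TB guess")
```

Under that assumption the 1TB to 5TB jump works out to roughly 1.5x per month, and the 16TB figure is 1.6x the guess.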
@bnewbold also mentions new initiatives like Jetstream and other tooling that provide a lighter experience
well, and that's true! ... though that's done by weakening the "zero compromises experience" quite a bit if you wanted to use them to "self host", more on that later
Now okay remember when I said "this sentence is gonna be important"
You've forgotten it already?
Okay fine I'm gonna quote it again
> Operating some of these components might require collective (not individual) resources.
Okay don't forget it this time! Don't forget it!
> This doesn't mean only well-funded for-profit corporations can participate! There are several examples in the fediverse of coop, club, and non-profit services with non-trivial budgets and infrastructure.
This is certainly true on the fediverse, I am hosted by a co-op. Thank you social.coop 💜
(@bnewbold is also on social.coop!)
> Organizations and projects like the Internet Archive, libera.chat, jabber.ccc.de, Signal, Let's Encrypt, Wikipedia [...], the Debian package archives, and others all demonstrate that non-profit orgs have the capacity to run larger services.
Wait a minute hold on
> Many of these are running centralized systems, but they could be participating in decentralized networks as well.
no wait but wait back up hold on what was that list again
Ok, XMPP and IRC are mostly ephemeral text and I love them, but let's be honest, they're pretty niche and on the decline
I've just... wait a minute we've got to look at some of the org choices here
What are the annual budgets of these FOSS service-hosting orgs?
- Wikimedia: $178 million/year
- Signal: $50 million/year
- Let's Encrypt/ISRG: $7 million/year
- Internet Archive: $25 million/year
This is public information, you can look this up! Read their 990s.
This is all to say, this is not your neighborhood block getting together to pitch in a few bucks to help out their FOSS friends
These are great orgs and compared to large for-profits, these orgs are efficient and use their money well
But these are SIZABLE hosting costs, and NOT easy to fundraise
I say this, by the way, as an Executive Director of a FOSS nonprofit with a much smaller budget and also oh god I hate fundraising I promised myself I would never do a fundraising job again why am I doing this
Did I mention we're doing a fundraiser? https://spritely.institute/donate/
Just sayin' ;_;
People worry about wasteful funding, and right now FOSS organizations are losing many of the funding sources they have. Project 2025 specifically targeted taking away the incredibly small amount of money that FOSS orgs get from governments
Fundraising is the worst and it's so hard to fund anything
My friend @n8fr8 of the Guardian Project likes to point at Signal's budget and say "yeah that looks big, but you know how much the government spends on each fighter jet?" and it's some unimaginably large number, like *hundreds* of millions of dollars per jet
Signal is the cost of a jet wing
Anyway we should give Signal the jet wing money
Can someone get @spritely some of the jet wing money?
Anyway you'd think if you were upset about the government "taking your tax money" you'd at least want to get something out of it and FOSS helps everyone so this is so frustrating
So that's all to say that I think the choice of these orgs is pretty interesting because when you say "oh a bunch of FOSS nonprofits host community infrastructure" we're not talking social.coop costs with a bunch of these we're talking jet wing money
It's really hard to get that jet wing money
Anyway I'll stop talking about the jet wing money I promise
jet wing money jet wing money jet wing money
Please give FOSS nonprofits jet wing money
But anyway THE POINT IS what kinda scale are we thinking about? What's your frame of reference? Fediverse co-op? Or Signal?
But speaking of running FOSS nonprofits I now have an EXCITING MEETING about administrative duties of running my FOSS nonprofit
So, it is time for a... MEETING BREAK (like, an hour)
Followed by a tea break. (like, 10 minutes)
==== MEETING AND TEA BREAK HERE ====
Okay, I'm back from my meeting. I also have tea.
We're about to get to the first REALLY substantial part, which is terminology. Is it fair to call Bluesky "decentralized" or "federated"?
Both @bnewbold and I provided definitions and we are going to COMPARE and ANALYZE
Before we go any further I am just gonna say, I miss hiding the easter eggs, but I don't think I can do that again
If you know anything about my projects you know that I love goblins. Have for a long time. When we launched MediaGoblin I would get people saying "nobody will ever like goblins"
WELL
Now we live in an era of "Goblincore" and people self-describing as Goblins
I am pleased. And I am pleased to be into Goblins before they were cool.
The Goblin theme continues at Spritely as you may know
But if you've read this far, let me know that you found Secret Goblin #1 😈
So, is Bluesky decentralized? Is it federated?
In my previous blogpost, I concluded that Bluesky was neither.
@bnewbold conceded that maybe Bluesky does not meet *my* definitions, but provides some alternative definitions, which maybe it does meet
Were my definitions too strong or unfair?
@bnewbold declares he will "choose his own fighter" and selects Mark Nottingham's independent IETF submission, RFC 9518: Centralization, Decentralization, and Internet Standards
https://datatracker.ietf.org/doc/rfc9518/
It's an interesting document, and it turns out, has some interesting context
Bryan cites Mark's definition of *centralization* (which I hadn't defined!):
> [...] "centralization" is the state of affairs where a single entity or a small group of them can observe, capture, control, or extract rent from the operation or use of an Internet function exclusively.
Good so far!
However it's time to compare definitions of *decentralization*. First mine:
> Decentralization is the result of a system that diffuses power throughout its structure, so that no node holds particular power at the center.
I stand by this!
Now here is Bryan's definition (more accurately Mark Nottingham's definition (more accurately, Paul Baran's definition)) of decentralization:
> [Decentralization is when] "complete reliance upon a single point is not always required" (citing Baran, 1964)
Uh, hm... this seems... pretty weak?!
This definition of decentralization is so weak it may as well say "Users may occasionally not rely on a central gatekeeper, as a treat"
It's pretty weak, and yeah Bluesky qualifies, but that's... I'm gonna be honest that's an *incredibly* weak definition by comparison
Let's look at the delta between my definition of decentralization and the one chosen by Bryan:
- The discussion of power dynamics, and diffusion thereof, is removed
- The phrase "complete reliance" is introduced, so incomplete reliance is now ok
- And not only that, now it's "not always required!"
In my previous blogpost I had expressed worry about moving the goalposts of "decentralization". That is *exactly* what's happening here, and what's being said is "if we weaken the definition dramatically, then Bluesky qualifies"
This is, IMO, not a very compelling look I've gotta say
Now you might notice this citation [Baran, 1964] and hey if you work on network things you might be thinking "Hey Christine, wait isn't this one of the seminal papers on networking which led to the internet?"
GOOD QUESTION LET'S COME BACK TO THAT
The context is CRITICAL.
Back to that in a moment.
Okay so "decentralization", maybe Bluesky qualifies if we use an unimaginably weaksauce definition that's so loose you don't even have to comply with it hardly at all?
So okay now let's compare definitions of "federation".
My definition:
> [Federation] is a technical approach to communication architecture which achieves decentralization by many independent nodes cooperating and communicating to be a unified whole, with no node holding more power than the responsibility or communication of its parts.
Bryan's definition (more accurately Mark Nottingham's definition):
> [...] federation, i.e., designing a function in a way that uses independent instances that maintain connectivity and interoperability to provide a single cohesive service.
Hm okay, well these don't look quite as far apart, right?
So what's the delta?
- The discussion of power dynamics, once again, is not present.
- "Cooperation" is not present.
- And very specifically, "decentralization" and "no node holding more power than the responsibility or communication of its parts" are not present.
Turns out this has a big effect.
Re-read and compare. Under that last definition, even corporate but proprietary internal microservice architectures or devops platforms would qualify as federated!
Maybe? But it's not federation in a *decentralization* context.
(That last observation is thanks to @vv btw, good observation from a good gf)
Bryan then acknowledges it's a comparatively low bar:
> What about federation? I do think that atproto involves independent services collectively communicating to provide a cohesive and unified whole, which both definitions touch on, and meets Mark's low-bar definition.
@cwebber Side adventure: is email federated by that definition?
@shtrom email is absolutely federated per my previous blogpost
But actually in this case, Mark Nottingham's definition happened directly within a context talking about "decentralization mechanisms", enough so that maybe it was stronger in the RFC. I dunno.
More comments on the RFC in a bit though.
However, Bryan does concede the following:
> Overall, I think federation isn't the best term for Bluesky to emphasize going forward, though I also don't think it was misleading or factually incorrect to use it to date.
Well okay, actually that's quite the thing to concede, so massive props on that
Bryan also in that same paragraph goes on to mention some very interesting history about Bluesky's earlier prototypes and how the design changed. Worth reading btw. But that's an aside, kinda.
It seems that there might be more of a concession here that Bluesky isn't federated, so the bigger question really is whether or not it's decentralized.
I mentioned that the definition is interesting in context and BOY is it interesting in context, oh gosh oh boy
Hey remember earlier when I said this thing:
> now here is Bryan's definition (more accurately Mark Nottingham's definition (more accurately, Paul Baran's definition)) of decentralization
Did you notice all the parentheses? That's not JUST because I love lisp
I mean I do love lisp
But not only
We need to understand Mark Nottingham's RFC and we need to understand Paul Baran's seminal 1964 paper both, within the contexts they were written, before we can pull this quote-of-a-quote out.
So let's start with the RFC.
If you hear "Respected standards technologist Mark Nottingham's independent IETF RFC 9518: Centralization, Decentralization, and Internet Standards", what do you think you'll find inside?
I'll tell you what I'd expect
Rah rah decentralization!! The internet was meant to be free!!!
Well...
You should read the RFC yourself, here it is: https://datatracker.ietf.org/doc/rfc9518/
Mark Nottingham is a respected, accomplished standards author. And with good reason. Most of his work history is representing standards for big corporate players.
That's how most of it is these days, actually
The surrounding context of the RFC is a debate within the IETF and elsewhere: gosh! this internet! it sure seems to have centralized a *lot*, is this really what we wanted to happen to it? This wasn't the original vision!
Shouldn't standards orgs do something to fix it?!
Well should they?
Mark Nottingham's own words answer better than I do, and you should read the RFC. It's not quite one way or the other. It's kind of a "well decentralization is great and yeah centralization is bad but how realistic is decentralizing things anyway and when?"
But Mark's own words handle it better
From the RFC:
> This document argues that, while decentralized technical standards may be necessary to avoid centralization of Internet functions, they are not sufficient to achieve that goal because centralization is often caused by non-technical factors outside the control of standards bodies. As a result, standards bodies should not fixate on preventing all forms of centralization; instead, they should take steps to ensure that the specifications they produce enable decentralized operation.
Let me emphasize a sentence there for you:
> standards bodies should not fixate on preventing all forms of centralization
That is the crux of this RFC
It's an interesting read, it's very thoughtful, it analyzes from many angles. It's worth reading! But that is the broad sweep of RFC 9518.
Mark examines centralization's effects from multiple angles. He has a *great* section called "Centralization Can Be Harmful". Covers the general ground.
But it's immediately followed by "Centralization Can Be Helpful"!
This is not a radical pro-decentralization RFC, is what I'm saying.
Mark does address the radicals:
> Many engineers who participate in Internet standards efforts have an inclination to prevent and counteract centralization because they see the Internet's history and architecture as incompatible with it.
So true bestie, that's me you're describing
While Mark analyzes both, his position is ultimately that of someone who does care about standards, but takes a kind of pragmatism that hey, look, decentralization, it's a great goal, but it's pretty hard, and maybe actually centralization is pretty helpful too, let's not go too wild here
The history of the internet and the web *is* of big dream believers making big strides. The internet has been moving away from that, and it's getting harder to participate in standards without being a big corporate player. (Trust me, I know *all too well.*)
So, *should* standards orgs do something?
As a side note on the thread on the other place, Bluesky dropped one of my replies and literally refuses to pull it up for me even though it acknowledges it's there
I have the worst time navigating replies on Bluesky, sometimes I send people threads and they say "I don't see the reply you're talking about there"
Dear god for all the claims of ATProto and Bluesky making a big deal of having no missing replies it's really frustrating dealing with replies on Bluesky's UX
Anyway...
Anyway Mark, tell us, what should standards orgs do?
> Centralization and decentralization are increasingly being raised in technical standards discussions. Any claim needs to be critically evaluated. As discussed in Section 2, not all centralization is automatically harmful. Per Section 3, decentralization techniques do not automatically address all centralization harms and may bring their own risks.
Note this framing: centralization is not necessarily harmful, decentralization may not address problems and may cause new ones.
Rather than a rallying cry for decentralization, it's a call to preserve the increasing status quo: yes, it's worrying large corporations are centralizing the internet, but should *standards* really be worried about that?
More from the RFC:
> [...] approaches like requiring a "Centralization Considerations" section in documents, gatekeeping publication on a centralization review, or committing significant resources to searching for centralization in protocols are unlikely to improve the Internet.
RFC, cotd:
> Similarly, refusing to standardize a protocol because it does not actively prevent all forms of centralization ignores the very limited power that standards efforts have to do so. Almost all existing Internet protocols -- including IP, TCP, HTTP, and DNS -- fail to prevent centralized applications from using them. While the imprimatur of the standards track is not without value, merely withholding it cannot prevent centralization.
RFC, cotd:
> Thus, discussions should be very focused and limited, and any proposals for decentralization should be detailed so their full effects can be evaluated.
Mark is not wrong that standards can't prevent centralization on their own! Mark's analysis of how many things end up re-centralizing is, overall, also largely correct!
However, I disagree in the present moment that standards orgs shouldn't be making decentralization concerns a *key priority*.
But Mark, to be fully fair, does examine several strategies, and their strengths and downfalls, of how we may enable decentralization.
However, the path that Mark most heavily leans into is "Enable Switching". Hm. Does that phrase sound familiar?
"Enable switching" from the RFC:
> The ability to switch between different function providers is a core mechanism to control centralization. If users are unable to switch, they cannot exercise choice or fully realize the value of their efforts because, for example, "learning to use a vendor's product takes time, and the skill may not be fully transferable to a competitor's product if there is inadequate standardization".
(cotd ...)
"Enable switching" cotd:
> Therefore, standards should have an explicit goal of facilitating users switching between implementations and deployments of the functions they define or enable.
Does this sound familiar? If so, it's because it's awfully close to "credible exit"!
As said, I think "credible exit" is a worthwhile goal. But it isn't participatory decentralization, on its own. The ability to *move away* is good, but what if your options are to choose between McDonalds and Burger King? Is that *sufficient*?
In particular, Mark is especially fair to highlight that email and XMPP are great examples of decentralized systems that either ended up centralizing, in the case of email, or failing to stay alive after the exit of a major player, in the case of XMPP.
Mark's RFC has a lot of useful analysis. It does!
So I've given a lot of context for Mark's RFC: it's an RFC by a respected standards author who has a long history of participating in standards on behalf of major internet-based corporations. It worries a bit about centralization but overall downplays decentralization more than it plays it up IMO.
And this is important of course, because this is the RFC where the definition of "decentralization" being provided comes from!
Or wait, or is it? Oh right, the RFC cites another source for its definition!
It's time to examine Paul Baran's 1964 paper. The story is about to become more intense.
Except, like a 1990s sitcom, we're gonna cut to a break!
We'll be back... after
=== TEA BREAK 2: MY NOSE IS COLD ===
Alright I'm back from my tea break. But I have a confession for you.
I made hot chocolate instead.
But we are going to get into the second part of the unnecessarily thorough "decentralization" terminology deep dive I'm doing here in just a moment
Before we get into that it's also getting pretty late here and I have another confession to make to you, I was pretty hungry, so you know what I did? I stood in the kitchen and I ate hummus in the kitchen with a spoon over the sink
You have found Secret Goblin #2, judging me for my hummus shame 👿
When we last left off I was peeling back layers of the terminology onion and we have gotten to the inner layer (maybe it goes deeper, I guess terminology usually does but this is as far as we go)
It is time to examine "decentralization" in Baran 1964
Because I am being UNNECESSARILY thorough
So here is Paul Baran's "literally the most influential paper to affect networking systems ever" 1964 paper:
"On Distributed Communications: I. Introduction to Distributed Communication Networks" https://www.rand.org/pubs/research_memoranda/RM3420.html
It's good, it's amazing, it's INCREDIBLY visionary
So okay yeah it's very military-oriented but... but! The context for this paper is that Paul Baran is arguing for what eventually *becomes* networking as we know it. Baran says: let's use *cheap* equipment with *way less centralization than we've ever seen* and it'll be *better actually!*
And just imagine the *gall* of it: telling the *military* let alone the world oh you know how you love hierarchy? Well guess what, you know what's WAY better, something that's closer to cooperative anarchy, where there's a lot of cooperation between lots of error-prone little guys
AND HE WAS RIGHT
Baran comes in with the math to back up his claims, a vision of how basically wifi and satellite and land lines and cable internet would all work together before we even *had* any internet stuff, shows how a packet would look, and says if you want to REALLY be tough, be... "distributed"
Hm, did you notice I said "distributed" and not "decentralized"?
Actually wait... does this sound familiar, have you heard of this paper before?
Could it be? No... it couldn't be...
And yes of course it is literally the paper that gives us this incredible FIGURE 1, which you have CERTAINLY seen if you have ever heard ANYONE talk about ANY "decentralized" or "distributed" system ever
[Figure 1 from Baran's paper: three network diagrams labeled CENTRALIZED, DECENTRALIZED, DISTRIBUTED]
You know this image. You could never forget this image
One of the reasons you know this image is that everyone worth their salt who works on decentralized networks thinks about this image and puts it in their talks
But also so does this bro who has literally no idea about how tech works but thinks he does
So one way or another you're gonna see it
(tech bro courtesy https://www.threepanelsoul.com/comic/job-interviews)
That comic is from Three Panel Soul btw, and here's the link https://www.threepanelsoul.com/comic/job-interviews
All of Three Panel Soul is good, but the Tech Bro ones are my favorites https://www.threepanelsoul.com/comic/search/Tech%20Bro
I love Three Panel Soul so much
(Gonna weird out @3psboyd by fangirling over here)
*COUGH* where was I
"Christine if you love this paper so much why don't you like the definition of 'decentralized' from it?!"
The definition is great actually if you know the context
Because the context is CRITICIZING THE DESIGN UNDER THE DEFINITION AS A FORM OF CENTRALIZATION
"What Christine you can't mean that, why would 'decentralized' be 'centralized' that can't be true"
Because because BECAUSE my good friend, Baran was describing "decentralization", a term that ALREADY EXISTED in networking, as being a kind of centralized system
NO REALLY I AM SERIOUS
The term "decentralized" was *already* in active use! So Baran was providing "distributed" as the new term! Oh my god THAT'S WHY THE DEFINITION BARAN PROVIDED FOR DECENTRALIZATION WAS SO WEAK
You don't believe me? Let me show you. LET ME SHOW YOU
Here is where Baran defines "decentralization!" We have to read the whole definition!
You're not allowed to stop until we finish EVERY (cotd) let's GOOOO
> The centralized network is obviously vulnerable as destruction of a single central node destroys communication between the end stations.
(cotd)
@cwebber It is the middle of the night and I am worried that reading this post in my head is going to wake my dog
Baran "decentralization" cotd:
> In practice, a mixture of star and mesh components is used to form communication networks.
IN PRACTICE FOR CENTRALIZED SYSTEMS YOU GUYS
(cotd)
Baran "decentralization" cotd:
> For example, type (b) in Fig. 1 shows the hierarchical structure of a set of stars connected in the form of a larger star with an additional link forming a loop.
OH SHIT HE'S STILL TALKING ABOUT CENTRALIZATION FIGURE B IS THE MIDDLE ONE
(cotd)
Baran "decentralization" cotd:
> Such a network is sometimes called a "decentralized" network, because complete reliance upon a single point is not always required.
OKAY WE'RE DONE
But look at it all together! He's talking about how "decentralization" is a term of art but it's still CENTRALIZED
Baran didn't make up the term "decentralized" it already was being used in practice to talk about top-down hierarchical systems! Baran calls this version centralized even if there's a "loop" (a small number of top-level providers)!
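If you want to poke at this yourself, here is a small Python sketch of the three shapes from the figure. To be clear about what is assumed: the node counts and exact wiring are mine, chosen to roughly match Baran's description (a single star, a star of stars with one extra link forming a loop, and a mesh), and the helper names are made up. It knocks out each node in turn and reports the worst case: how many nodes can still reach each other in the largest surviving group.

```python
from collections import deque


def largest_component_after_removal(edges, removed):
    """Size of the largest connected group once `removed` is knocked out."""
    adj = {}
    for a, b in edges:
        if removed in (a, b):
            continue  # links touching the removed node are gone
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    nodes = {n for edge in edges for n in edge} - {removed}
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search to measure this group.
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adj.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def worst_case(edges):
    """Smallest 'largest surviving group' over every single-node failure."""
    nodes = {n for edge in edges for n in edge}
    return min(largest_component_after_removal(edges, n) for n in nodes)


# (a) centralized: one hub, six leaves (7 nodes).
centralized = [("hub", f"leaf{i}") for i in range(6)]

# (b) "decentralized" in Baran's sense: a root connected to three local hubs,
# each with two leaves, plus one extra hub-to-hub link forming a loop (10 nodes).
decentralized = [("root", f"hub{i}") for i in range(3)]
decentralized += [(f"hub{i}", f"leaf{i}{j}") for i in range(3) for j in range(2)]
decentralized.append(("hub0", "hub1"))

# (c) distributed: a 3x3 grid mesh (9 nodes).
distributed = []
for r in range(3):
    for c in range(3):
        if c + 1 < 3:
            distributed.append(((r, c), (r, c + 1)))
        if r + 1 < 3:
            distributed.append(((r, c), (r + 1, c)))

results = {
    "centralized": worst_case(centralized),
    "decentralized": worst_case(decentralized),
    "distributed": worst_case(distributed),
}
```

In this toy setup, losing the hub of the star leaves every node alone, losing the root of the star of stars still splits the network into separate clusters, and the mesh stays in one piece after any single failure. That middle case is what "complete reliance upon a single point is not always required" is describing.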
YOU GUYS THIS IS NOT HOW WE ARE USING "DECENTRALIZED"
WE are not describing the future of routing small packets in 1964, that is NOT the world we are existing in, where "decentralized" meant a top-down hierarchical structure
When WE talk about "decentralized", we mean roughly a spectrum, with "centralized" on one side and "decentralized" on the other
Now I don't think Bryan Newbold realized that when he pulled his definition from Mark Nottingham who pulled his definition from Paul Baran, that this was the case. I think this is a game of telephone.
(I don't know how Mark Nottingham didn't realize it but that's an aside)
What I DO know is that it means that the entire structure of analyzing decentralization in Mark's paper and Bryan's blogpost thus, in practice, surrounds a term that is weak because it was FUNDAMENTALLY describing a centralized system, so it could criticize it
The loss of context here is BRUTAL
To conflate the two *automatically* introduces decentralization-washing. I don't think this is intentional, but it explains a lot.
It explains how a "weak" definition of decentralization could come from one of the boldest visions of what that very *idea* could be
Now okay let's point out the irony here because I feel like if I don't I'm being mean. Bryan does say:
> To some degree, I don't really want to spend time in a terminology debate.
And I just did! At length!
But the whole debate this whole time is "is Bluesky decentralized" so we kinda HAVE to
But also what happened was:
- I lay out a strong definition of decentralization; Bluesky doesn't match
- Bryan suggests an alternate definition, pulls from
- An RFC which despite the title is extremely lukewarm AT BEST about decentralization which pulls from
- A definition describing centralization
And I don't think this was malicious on Bryan's part in the least because I know Bryan well enough to know he's not like that!
I am pretty annoyed at Mark though for quoting this out of context in such a way that it can completely confuse a narrative like this. I'll assume that was a mistake but
The reality is that Bluesky didn't match my definition of decentralization, and I hope it's pretty clear now that the alternate definition supplied was literally one about centralization
And so that cannot possibly be a lower bar that we say "okay maybe Bluesky can pass this one" I am sorry
Let's PLEASE not move the goalposts on "decentralization". Let's certainly not move them back to something that was literally "here's what centralization looks like in practice".
That's what I'm asking for here. That's why I went so goddamned HARD on terminology here.
Let's check the time.
It's 7:30pm where I am. I woke up at 4:30am and resumed work on my blogpost at 5am.
I have been, for the most part, between the blogpost, my job, and this thread, sitting at my computer fighting for decentralization for about 14 hours. It's been like that a lot lately.
@cwebber found a goblin. Woohoo
I have a reputation at work of being good at pushing others to take off time and they HAVE to take off time OR ELSE and I try to be that way in general. But I am really truly bad at doing so for myself and I know I have crossed my limits for today.
So let's wrap up for *tonight* in a sec
We're about halfway through this blogpost. There's a lot going on in my life. I am trying so hard to keep the organization I work for alive and moving forward. I am tired. I need rest. And I still need to drive two hours across the state tonight.
We're going to resume tomorrow. But first...
There's a reason I'm going really hard on this. I really care a lot about the shape of the internet. And tomorrow we're going to get into some more analysis and a talk about *values*, and one thing I like is that Bryan talked at length about Bluesky's values. And I think that part was really good.
For tonight, I need to unwind, I need to put a label on a mailbox, I need to eat dinner, I need to drive across the state, I need to sleep.
Maybe I appear ridiculous. I get it. I go pretty hardcore on this stuff. If you know me you know I tend to go all in.
I am signing off for the night. Tomorrow we will analyze whether or not my assertion that ATProto has "explosive behavior as it approaches decentralization" problems holds up.
I'm not going to read notifications until I finish this. Maybe someone will prove me wrong before I get it done.
I'll be oblivious.
We will also analyze values, which maybe I care about more than anything. And there will be more secret goblins, hidden among the posts.
For tonight, it's rest time. It's time for a
=== NO MORE LOOKING AT MY COMPUTER BREAK ===
@cwebber If ATproto describes its design (imho inaccurately) as “adversarial interop”…
Maybe ActivityPub could be described as: laissez-faire interop
@cwebber this is literally in [[decentralized]] in the Agora (the social knowledge graph I'm developing) but I didn't know where it was from, thanks for finding the source!
It should of course also be linked at [[distributed]] for completeness :)
@cwebber Also on how it could have ended up in implementation:
https://en.wikipedia.org/wiki/X.25
VC may be established using X.121 addresses. The X.121 address consists of a three-digit data country code (DCC) plus a network digit, together forming the four-digit data network identification code (DNIC), followed by the national terminal number (NTN) of at most ten digits.
Hello! I am back at my computer. Today we are going to talk about how ATProto does in terms of scaling. Yes, we know it scales up, and has done an impressive job of doing so!
But what about scaling towards decentralization? Does it scale down? And does it scale wide? Let's look.
Before we get deep into that, when we left last night I was extremely tired and had been working at my computer for over 14 hours. I then said I was going to drive two hours across the state that evening.
Thankfully, with the support of people who love me, I did not do that foolish thing!
So anyway, I am better rested, and also I woke up to the surprise that our fundraiser is doing a lot better, like by a lot, than it was yesterday, which is nice because I was extremely stressed out https://spritely.institute/donate/
So I am feeling much better and alive and today I remembered to eat lunch
But you probably aren't here to hear about my lunch choices or how much sleep I got or whether or not I forgot to bring my ADHD medication with me (I did so now I am drinking a bunch of caffeine instead), you are probably here to hear the rest of the analysis about decentralization and Bluesky etc
So let us get to it, let's talk about whether or not Bluesky can scale *down* in a meaningful way.
In my last essay I made assertions that this was important for decentralization and said ATProto wasn't great for this, and this was one thing people challenged me on
So let's take a look!
When I say "scale down", what I generally mean is "small instances can generally participate on the network". (We'll talk about "scale wide" later.) But another useful possibility which has come up is "can you make a smaller, more isolated use-case and use the same protocol for it"
This latter version of scale down does come up in Bryan's article:
> A specific form of scale-down which is an important design goal is that folks building new applications (new Lexicons) can "start small", with server needs proportional to the size of their sub-network.
(cotd)
Strictly speaking, I agree, ATProto can scale down in this use case! For example, if you wanted to make a small specialized forum for collaborative storytelling, you could use ATProto for it, and that's true, you could do it
But is it the right choice?
In some ways we are talking about two different things here: extension of functionality (which you might want the same scale for) and having a smaller and more isolated community
But regardless
ATproto positions itself *specifically* as designed for not wanting to miss messages, and I talked previously about how ATProto's design requires a god's-eye view.
It's a bit of a strange choice when you say "let's run a smaller community"
Given that message passing systems handle small scale systems *beautifully*, and *still* allow for interactions with larger scale systems, it's a bit confusing to me *why* you'd choose ATProto for such use cases. What is the specific benefit you'd gain? Especially because it's actually lossier here
At any rate, there's a bit of conflation here. "It scales down" by saying "you can have an isolated community/use case that's oblivious to the rest of the system" is categorically distinct from "it scales down" in terms of "a small node can meaningfully participate with the larger system"
At any rate, the problem with "scaling down" is much clearer when it comes to the problem of "scaling wide".
Or let me put it a different way: ATProto *explodes in complexity* when you try to scale it towards meaningful decentralization
Yes that's right we're getting to the spicy part of this conversation. We did the warm-up, now it's time to talk about the real thing: whether or not decentralization, in the sense I believe people *think* that term means, is reasonably possible with ATProto as it's currently designed
But before we do that, I need to stretch and run to the bathroom
So for those of you following along, if you found this, Secret Goblin #3, let me know: "👺"
Oops wait actually we gotta talk about that one for a sec there's a reason I left it in scare quotes
Why on earth is the textual descriptor for Unicode U+1F47A "JAPANESE GOBLIN", does anyone know?
It's a Tengu, right?
Despite being the only actually named "goblin" emoji, I feel awkward about this one because is it correct to call it a "JAPANESE GOBLIN" instead of just "TENGU"?!?!
I don't know!
If you have knowledge or OPINIONS about "👺", its name choice in unicode, or, for that matter, a white person just dropping it in the middle of a group chat WITHOUT putting it in quotes (I did tho), feel free to derail the comment thread
Otherwise it's time for a
=== STRETCH BREAK ===
I'm back. It's time to talk about it: does Bluesky/ATProto suffer a "quadratic explosion" as we move from centralization towards *meaningful* decentralization?
I claimed it did, but I was challenged on this. What did I mean? Am I right or wrong?
It's time to find out!
In the previous blogpost I said the following:
> If this sounds infeasible to do in our metaphorical domestic environment, that's because it is. A world of full self-hosting is not possible with Bluesky.
(cotd)
Decentralized ATProto is quadratic quote, cotd:
> In fact, it is worse than the storage requirements, because the message delivery requirements become quadratic at the scale of full decentralization: to send a message to one user is to send a message to all. Rather than writing one letter, a copy of that letter must be made and delivered to every person on earth.
This was probably the thing I got the hardest pushback on from a team member of Bluesky, that it is not quadratic as we scale towards decentralization.
Truth be told, I don't have a degree in CS. Most of what I know I learned from studying independently and community resources. Was I wrong?
Just as a quick aside, regarding that comment about "agency", maximizing the agency of everyone (and more importantly, minimizing subjection!) sits at the heart of my ethical framework https://fossandcrafts.org/episodes/11-an-ethics-of-agency.html
So I don't disagree on that part, but that's an aside!
Now, I said I won't read replies until I am done summarizing things, and that's true, so maybe someone has gone out of their way and proven that I am wrong, that the claims in my article are factually incorrect and so on and so forth. I wouldn't know yet.
But... I don't think I'm wrong.
As said I'm very self-conscious about these things because I *don't* have formal CS training. But I do a lot of research and so I've tried to become knowledgeable about these things and this *seemed* like the correct analysis to me
Because of that, I turned to people who actually knew more than me
For one thing I derailed the entire Spritely morning standup by walking everyone through the scenario. I gave the story example, which I'll detail later.
But @dthompson didn't find the story helpful, too much narrative detail. "I need to work through this example independently." So he did.
@flockofbirbs.bsky.social came back and laid it out in more formal terms and said I was right.
But I was still nervous, so I called up one of my old MIT AI Lab type friends and rambled about it to them on a call. What did they think?
"I think it's pretty clear immediately that it's quadratic. This is basic engineering considerations, the first thing you do when you start designing a system," they said.
Well that's a relief, why isn't it clear to everyone else, I asked?
So they suggested I lay it out to you as I did to them.
Let's start with the following:
- ATProto has positioned itself as "no compromises on centralized use cases". Well, in that case, let's say it can't do *worse* than eg ActivityPub. This includes with replies. You can't do *worse* than ActivityPub on replies and mentioning someone, etc.
- We will interpret the most centralized system as one where there's only one provider for storage and distribution of all messages: the least amount of user participation
- The flip side of the spectrum of maximum decentralization is the *most* amount of participation: every user self-hosts.
- Just as blogging is decentralized but Google (and Google Reader) are not, it is not enough to have just PDS'es in Bluesky be self-hosted. When we say self-hosted, we really mean self-hosted: users are participating in the distribution of their content.
- We will consider this a gradient. We can analyze the system from the greatest extreme of centralization which can "scale towards" the greatest degree of decentralization.
- Finally, we will analyze both in terms of the load of a single participant on the network but also in terms of the amount of network traffic as a whole.
So okay. Let's get the CS notation out of the way:
"Message passing" at full decentralization:
- O(1) from a single node's perspective
- O(n) from a whole-network zoom-out perspective (inherent: add a user, it's one more user)
Okay, that's reasonable and what you'd expect
"Public global no-missed-messages (or not worse than AP) shared-heap" ATProto style at full decentralization:
- O(n) from a single user's perspective (!)
- O(n^2) from a whole-network perspective (!!!!!!)
Oof I'd better back this up because that ain't good!
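Here's a minimal sketch of those four bullets as a cost model in Python. It assumes every user posts a fixed number of messages per day and each message has a single intended recipient; the function names are made up for illustration and aren't from ATProto or ActivityPub.

```python
def message_passing_load(n_users: int, msgs_per_user: int = 1) -> tuple[int, int]:
    """Return (messages received per node, deliveries network-wide) per day."""
    per_node = msgs_per_user            # O(1): only what is addressed to you
    network = n_users * msgs_per_user   # O(n): one delivery per message sent
    return per_node, network


def shared_heap_load(n_users: int, msgs_per_user: int = 1) -> tuple[int, int]:
    """Same pair, assuming every self-hosted node must receive every message."""
    per_node = n_users * msgs_per_user  # O(n): each node ingests everything
    network = n_users * per_node        # O(n^2): n nodes each ingesting n messages
    return per_node, network


for n in (26, 1_000, 1_000_000):
    print(n, message_passing_load(n), shared_heap_load(n))
```

The only thing that differs between the two functions is whether a node receives what's addressed to it or receives everything. That one choice is where the extra factor of n comes from.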
In other words, as our systems get more decentralized, message passing handles things fine. Individual nodes can participate in the network no matter how big it gets. The zoom-out for the network as a whole doesn't get more complicated as we add more users OR move more users towards self hosting.
Things are NOT good, if I'm correct above, as we make things more decentralized in the atproto-public-shared-heap model. The more self-hosting and indeed the more "full nodes" join, the more expensive it gets for each of the nodes and the network EXPLODES!
Truly self-hosted atproto is NOT POSSIBLE!
And there is no solution to this without adding directed message passing. Another way to say this is: to fix a system like ATProto to allow for self-hosting, you have to ultimately fundamentally change it to be a lot more like a system like ActivityPub!
Now I left more of the precise analytical explanation in my blogpost. But social media isn't great for that, so go check out my blogpost if you want to go through all that (eg if you're more like @dthompson and less like me, I'm a narrative person) https://dustycloud.org/blog/re-re-bluesky-decentralization/
Here's our story:
- We have 26 users: [Alice, Bob, Carol, ... Zack].
- Each user sends one message per day, which is intended to have one recipient. (This may sound unrealistic, but it's fine for modeling.)
- Each user sends a message in a ring: Alice => Bob, Bob => Carol, ... Zack => Alice
Now just before you say "wait but ATProto isn't for DMs", yes, but one way this could happen is that eg Bob follows Alice, Carol follows Bob, etc.
What I'm saying is, messages can have an "intended audience". That's what we're using here.
Before we get into this, remember, the main difference between "message passing" and the "shared heap" is the former has directed and delivered messages, the latter does not. See prev blogpost for explainer.
So, what happens in a day for both systems? Because that's what we really want to find out.
Under message passing, Alice sends her message to Bob. Only Bob need *receive* the message. So on and so forth.
- For an individual self-hosted node, messages passed per day: 1.
- Per the decentralized network, total messages passed zooming out: 26.
That's about what we'd expect.
Under the public-gods-eye-view-shared-heap model, each user must know of all messages to know what may be relevant. Each user must *receive* all messages.
- Individual self-hosted server, 26 messages must be received per day.
- Zoom out on whole decentralized network: 26*26: 676!
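If you'd rather count than take my word for it, here's a small sketch that plays out one day of the ring and tallies what each node receives. It's a toy model, not either protocol: `simulate_day` is a hypothetical helper, the users are just the letters A to Z, and in the shared-heap case I'm assuming every full node ingests every message, its own included.

```python
from collections import Counter
from string import ascii_uppercase


def simulate_day(users, shared_heap):
    """Count messages each self-hosted node receives in one day of the ring."""
    received = Counter({u: 0 for u in users})
    for i, sender in enumerate(users):
        recipient = users[(i + 1) % len(users)]  # A => B, ..., Z => A
        if shared_heap:
            # Assumption: every full node must receive every message.
            for node in users:
                received[node] += 1
        else:
            # Directed delivery: only the intended recipient receives it.
            received[recipient] += 1
    return received


users = list(ascii_uppercase)  # 26 stand-in users
mp = simulate_day(users, shared_heap=False)
heap = simulate_day(users, shared_heap=True)
print(mp["B"], sum(mp.values()))      # 1 26
print(heap["B"], sum(heap.values()))  # 26 676
```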
Sounds survivable with 26 users though, right?
Let's try just adding 5 more users.
Message passing:
- Per node per day: no change.
- Per the network: 5 more messages.
Public gods-eye-view-shared-heap-model:
- Per node per day: 5 more per day
- Per network: ((31 * 31) - (26 * 26)): 285!
Now, could we handle a million self hosted users? Is it possible? No problem in message passing. EXPLOSIVE with atproto.
What if we had a million users and added just 5 more? How many more messages must the network bear?
5 new messages in message passing.
*10,000,025* new messages sent in atproto!
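The arithmetic behind those numbers is just a difference of squares: going from n to n + k full nodes adds (n + k)^2 - n^2 = 2nk + k^2 deliveries per day in the shared-heap model, versus k in message passing. A quick sketch to check it, under the same one-message-per-user-per-day assumption (`extra_messages` is a name I made up):

```python
def extra_messages(n_users: int, added: int, shared_heap: bool) -> int:
    """Additional network-wide deliveries per day after `added` users join."""
    if shared_heap:
        # (n + k)^2 - n^2 == 2*n*k + k^2
        return (n_users + added) ** 2 - n_users ** 2
    return added  # one new directed message per new user


print(extra_messages(26, 5, shared_heap=True))           # 285
print(extra_messages(1_000_000, 5, shared_heap=True))    # 10000025
print(extra_messages(1_000_000, 5, shared_heap=False))   # 5
```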
"Christine that's ridiculous, we're not expecting a million self-hosted users"
Well I think it would be nice!
But regardless, ActivityPub has 27,000 servers on it, all meaningfully participating in the network.
ATProto, in its current design, would be crushed to DEATH
"But Christine", you may say, "I heard gossip might fix this!"
No. It cannot.
In fact, I was being more generous than a gossip network, and assumed you only *received* a message once.
With gossip you might *receive* more than once.
But you need to receive a message to know it.
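To illustrate, here's a rough sketch of flooding one message through a gossip overlay and counting receipts. The topology is entirely hypothetical (each node forwards to the next `fanout` nodes around a ring) and real gossip protocols are smarter about suppressing duplicates, but the shape of the result is the point: every node still has to receive the message at least once, and in this toy version most receive it several times.

```python
from collections import deque


def gossip_receipts(n_nodes: int, fanout: int) -> tuple[int, int]:
    """Flood one message from node 0; return (nodes reached, total receipts)."""
    seen = {0}
    receipts = 0
    queue = deque([0])
    while queue:
        node = queue.popleft()
        # Hypothetical overlay: forward to the next `fanout` nodes on a ring.
        for step in range(1, fanout + 1):
            peer = (node + step) % n_nodes
            receipts += 1  # the peer receives a copy, duplicate or not
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)  # forward only the first time it's seen
    return len(seen), receipts


reached, receipts = gossip_receipts(26, fanout=3)
print(reached, receipts)  # 26 78
```

Gossip changes *who* hands you the message, not *whether* you have to receive it.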
ATProto was designed for a "big world" view. That's fine! But I'm trying to show seriously what happens if it was actually, really decentralized.
*Every* fully participating node added to the network makes the network explosively more expensive.
ATProto doesn't scale towards decentralization.
In other words, the public god's-eye-view allows for a pantheon, but not a civilization. You can only have so many gods who see all.
An important characteristic of a decentralized system is scoping what you *don't* need to know.
This wasn't in the design goals of ATProto, and it has effects.
I may be coming across as some academic computer science nerd. It's actually the opposite. I'm a humanities nerd who cares about the agency of users so much I've twisted myself into a shape where I can do a computer science thing.
But architecture matters. It affects the worlds we can have.
This is what I mean when I say that Bluesky's goals of "credible exit" may be reasonable, but it's not decentralized. There is no getting around the fact that the system, as designed, is designed for a few large players. Small players can play on the *periphery*, but they can't play the big game.
Now, you might think, maybe ATProto could fix this!
And it can.
And the solution, ultimately, will end up looking... a lot like ActivityPub.
The point is that nearly everyone knows at this point that "sure, Bluesky is centralized today, in practice!" But a lot of the responses I see are "but decentralization is just around the corner thanks to ATProto!"
So that's why I'm writing this out.
Well, that's it. We've reached as far as we're going tonight.
There's still a bit left, a bit of reframing about what I am and am not concerned about with decentralized identity, and then a bigger topic about Bluesky's design goals vs community expectations. Then we'll talk about values.
Those last two, expectations and values, are really important to me. And I think they'll maybe be the most thoughtful part of all of this.
Of course, they're probably not what most people care about from me, about this. Probably what I've said is all many care to hear from me and that's fine.
For those who care about such things, tune in tomorrow, where hopefully we'll wrap this up. For those who were just hoping to hear the decentralization analysis, hope you found it useful.
Regardless, I wish you a very happy
=== REST OF TODAY BREAK ===
@cwebber We need symmetry and Big Tech will never ever give it to us even if they promise.
Well hello.
So yesterday I stepped onto a crumbled piece of sidewalk, twisted and sprained my ankle, and fucked up my wrist. That, and I think I've said the most important things and this is day *three* of summarizing things from my blogpost, so I will be brief.
Sadly, I'm stopping where things take a positive turn in my article: talking about values, which I thought was a nice part of both @bnewbold's and my articles.
I think the values/design goals Bryan did lay out are nice, and I talk more about ActivityPub and @spritely's values.
It was nice to be prompted about @spritely's values and it led to a good conversation internally, and we did capture those in my blogpost, but I think that should be covered again from a more official organizational side, separate from this.
I also clarified a bit: the parts I'm concerned about with the did:plc stuff aren't as much the governance, and I think Bluesky is taking some good steps there by planning a certificate transparency log. That's good. Glad to see it.
I do think Bluesky is heading in a tough direction though in terms of community expectations vs the ATProto philosophy that replication and indexing of a firehose are the primary way things work.
It's a tough situation but Bluesky is speedrunning Twitter so fast it practically is Twitter.
People want Bluesky's devs to prevent their content from being replicated and indexed by people they don't like. Well, I think it really is that: a *conflict*.
People were encouraged to join a Twitter replacement, they are expecting Twitter-like solutions. Can't blame 'em.
Given that "anyone can replicate and index!" is literally the *entire* design philosophy of ATProto, it's not going to be something easy to solve. I don't have an answer, but hey, I'm working on fairly fundamentally different designs, so it's not my problem to solve.
That said...
Like the present-day fediverse, Bluesky was majorly popularized by a bunch of queer people early on. As a trans person I watched a bunch of my friends join and, when the community was small, feel so safe that they posted things they never would have in today's environment.
The decision about whether or not to boot horrible, well known transphobic people (protip: answer is yes) from the platform seems clear enough to me. I'm not sure the "speech vs reach" approach is working.
And it seems to me people are finding they don't have tools in their hands to do anything.
For all its faults, and there are *many* and I have *railed* against the instance-oriented approach to moderation on the fediverse and have been writing about and working towards alternatives for a while, instance moderation empowers people better here.
I think this will be a real test for Bluesky.
But more broadly I think *neither* the present-day fediverse nor Bluesky meet the needs of the future.
The "global town square" is a social media concept invented by centralized social media in the early web 2.0 era.
Social media by millennials, for millennials. What's the future?
So to some degree, I don't have a lot of interest in trying to figure out what the solution to this is, because I think these are the wrong designs. I don't like the context-collapse firehose much at all, I'm interested in "contextual communication", "secure collaboration", and "healthy communities"
That's the kind of direction we're trying to build towards with @spritely, but as said, I'm dropping the values discussion here, that's something we'll talk about later in the week. I would like to talk about that independently, focusing there on what to build, not on a critique.
@cwebber I agree with this as first-principles, but I do note that every contextual/healthy/secure community that I'm a part of spends a pretty solid amount of time sharing and discussing stuff from the context-collapse firehoses (for many appropriate contextual reasons - laughter, learning, safety, etc.) I am not sure how we square those facts.
But I do think there's a big collision course ahead, and I don't know how it'll resolve. Investors and users who want quick resolution to real concerns on one side, a vision of public content that is highly replicated and indexed by anyone on the other side.
It'll be a challenge.
There's opportunities for collaboration maybe. I've asserted pretty strongly that Bluesky isn't decentralized and as a system, it isn't. You can't tear the power dynamics out of the analysis. Otherwise what's the point?
But Bluesky uses decentralization techniques, there may be collab space there.
I'm not trying to be a mean, horrible person to Bluesky's devs. I'm really not. I actually think that they've provided something much *better* than X-Twitter to a lot of people.
But Bluesky has speedrun this whole thing so fast, Bluesky is already no longer the underdog. It's Twitter TNG.
And that means we can't pretend that decentralization is something that's some future possibility or goal, that it's gonna happen some day we promise.
I'd love to be proven wrong on everything I laid out.
Though I think the only way to do that without being worse than AP is serious rearchitecting.
Who's empowered and who has agency and how we can increase the agency of everyone is, indeed, all I care about. It's what "decentralization" means to me and matters to me as a goal. You can't drop the power dynamics. It *is* about the power dynamics.
I want us to build a better future. A real one.
One thing I am confident about: it's not that Bluesky's engineering team doesn't care.
Actually I only really know two of Bluesky's main people well, Jay Graber and Bryan Newbold.
I know they do both care.
But so did Twitter's early devs. Twitter was supposed to be decentralized too.
It's easy to forget that Blaine Cook led a team at Twitter in early days to make Twitter decentralized and the team there was worried about the effects centralization can have.
Investors killed it anyway.
It has to be more than about caring, the work has to happen and be preserved.
I've said enough. I've said more than enough. I've said more than people probably thought could possibly happen on the subject on a blogpost or social media thread let alone *two*.
And that's with me dropping part of the second blogpost because I fell and hurt my hand.
It's time to wrap up.
I hope I haven't caused emotional strain on anyone. I spent a while walking back from brunch and was pretty depressed and was talking with my girlfriend: was I just *mean* about this whole thing?
She reassured me she didn't think I was, but I still feel like I was mean.
I tried not to be.
But despite there being literally millions of people on both Bluesky and the fediverse, I haven't seen any other analysis that went as comprehensively into architecture and terminology, and into their *implications* and *impact*, as I did.
I think it needed to be done.
So one more post after this one. Just one more post.
I have said so much, I feel like I am pumping the brakes on a train of analysis and it's taking a while to come to a halt but it's time. I want to wrap it up, for everyone reading this, for myself.
So here we go.
We should build decentralized systems because we care about empowering people. We can't forget about power distribution.
Let's be clear about what our systems can and can't do.
And no matter where you are, if you're trying to build a healthier internet for everyone, keep it up.
Thanks. 💜
@viq it likes pets
@cwebber hey, First Secret Goblin popped up! Does it want to be petted, or does it want to bite me in the shin and steal my lunch?
@cwebber @spritely Fighter planes don’t even NEED two wings. Why isn’t the war insurance senate committee denying these unnecessary claims?
https://www.sandboxx.us/news/that-time-an-f-15-landed-without-a-wing/
@cwebber Second Secret Goblin doesn't judge, only nudges you to share hummus
@cwebber I thought the comparison to libera.chat was especially interesting because they've got one "credible exit" behind them! Back when we were all on freenode, but had to suddenly move to libera to deal with the hostile takeover. Freenode was centralized but the network was able to recreate itself in a way I haven't really seen elsewhere because IRC clients could point to a different server and carry on.
@trwnh it's the best hummus
@cwebber was it at least good hummus
@gkrnours it's better than nothing if you don't have your meds
@cwebber wait, caffeine is an alternative to ADHD medication?
@cwebber one thing I am surprised no one has mentioned.. the very philosophy of a gods eye view is inherently a centralizing one ?
@josef Unfortunately every.org doesn't support donations less than $10, but you could make a one-time donation!
@cwebber I'd like to make a smaller than $10/mo regular contribution - where can I do that please? Thanks!
@josef Yep oops, fixed
@cwebber I think you meant to write "centralized today, in practice" here, right?
@slothrop It's confusing and in general, people don't do it. What people tend to speak about *is* a power dynamic spectrum.
Baran was dealing with the situation where the term was being used in a specific technical way *at the time he wrote the document*, so he had to introduce another term.
When *most people* today talk about decentralization, they mean the spectrum. We *at least* have to be clear about definitions if you mean something else.
🙂 @cwebber I just made a one-time $25 donation towards @spritely in recognition of your important work!
I'd encourage others to support too! You can do so here: https://spritely.institute/donate/