Redecentralization

A thought experiment! Let's reimagine the web with r-selected thinking.

Jul 31, 2022

Update 2022-08-31: We’re building the architecture sketched out in this post. Check it out: Noosphere, a protocol for thought.

A few more thoughts on how decentralization enables permissionless innovation, and what it might mean for today’s web.

So, Paul Baran reimagined networks for survivability. Packet switching made the network survivable by making it decentralized, redundant, diverse, adaptable. These qualities unlocked permissionless innovation and unleashed powerful commercial breakthroughs, which then recentralized the internet one layer up, around apps. My takeaway:

Perhaps there is a rule of thumb here? If you decentralize, the system will recentralize, but one layer up. Something new will be enabled by decentralization. That sounds like evolution through layering, like upward-spiraling complexity. That sounds like progress to me.

Redecentralizing the web?

Here’s a thought experiment in the spirit of “I want a pony”. What if we reimagined the web from Baran’s point of view? Can we imagine a new weblike thing that is to the web as packet switching is to circuit switching? In what ways might we design for decentralization, redundancy, diversity, adaptability? What might that unlock?

First, where is the web lacking these qualities today? In general, the web has recentralized around:

Trust: security, and thus identity and payments, are all bound to a given domain. All three have powerful centralizing effects.
Data: most stuff is trapped in silos by domain without credible exit.
Infrastructure: the actual server farms and cables the internet runs on. Economies of scale have powerful centralizing effects.
Attention: “A wealth of information creates a poverty of attention, and a need to allocate that attention efficiently.” (Herb Simon). This fundamental scarcity implies search, discovery, ads, spam, and a bunch of other thorny things.
Domain names: human-meaningful names are scarce and Zooko’s Triangle means we typically need a centralized authority (ICANN) to negotiate this scarcity. DNS infra is also centralized and vulnerable to censorship.

If you ask me, the domain name system is already pretty great, even though it depends on a centralized authority to negotiate names. The deeper problem is that the web binds trust, data, and infrastructure to domains through the same-origin security policy.

As a result, attention gets driven toward domains, which control trust, data, and infrastructure. The end result is aggregators, lock-in, and a landscape that is fundamentally feudal.

Gordon Brander @gordonbrander

You can’t have a distributed web when security, privacy, identity, data, and scripting all belong to the origin. Network effect makes aggregation inevitable, but same-origin makes aggregation mandatory.

Gordon Brander @gordonbrander

@pfrazee Yeah we’ve conceptualized network security as castle-wall, all the way down to same-origin security model. Given that, it’s inevitable that the web trend toward feudalism, and privacy, safety, and security all push us toward centralization. Time for a radical reimagining!

You might say the same-origin policy makes websites K-selected. Everything is centralized around origin, so the origin must be protected. This is similar to the central importance of switching stations, before packet switching made routers fungible.

So what might an r-selected web look like? One that decouples data, trust, infrastructure, and names? Let’s imagine…

Content-addressing decouples data from origin

It is generally recognized that the current approach of using IP address both as a locator and as an identifier was a poor design choice.
(Clark, 2018. Designing an Internet)

Content addressing is the big little idea behind IPFS. With content addressing (CIDs), you ask for a file using a hash of its contents. It doesn't matter where the file lives. Anyone in the network can serve that content. This is analogous to the leap Baran made from circuit switching to packet switching. Servers become fungible, going from K-selected to r-selected.
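To make that concrete, here is a minimal sketch of content addressing. It assumes a bare SHA-256 hex digest as the address, which is a simplification: real IPFS CIDs wrap the hash in multihash and multibase encoding. The names `content_id` and `ContentStore` are made up for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an address from the bytes themselves (stand-in for a real CID)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """A hypothetical node that stores blocks keyed by their content hash."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blocks[cid] = data
        return cid

    def get(self, cid: str):
        # Returns None when this node does not hold the block.
        return self._blocks.get(cid)


# Two unrelated nodes store the same bytes and arrive at the same address.
alice = ContentStore()
bob = ContentStore()
note = b"cat thoughts"
cid = alice.put(note)
bob_cid = bob.put(note)

# The address says nothing about location, so either node can answer.
assert cid == bob_cid
assert alice.get(cid) == bob.get(cid) == note
```

Because the address is computed from the content, no node has to be the "home" of a file. That is the sense in which servers become fungible.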

CIDs also give us resilience through redundancy. On the web, you request a resource from a domain. If the domain stops hosting the resource, your link is dead. With content addressing, many copies of a file can be sprinkled throughout the network. Lots of copies keeps stuff safe.
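A rough sketch of why one surviving copy is enough, under the same simplifying assumption that an address is a SHA-256 hex digest rather than a real CID. The peers here are plain dictionaries and `fetch` is a hypothetical helper, not an IPFS API.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Stand-in for a CID: the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


def fetch(cid: str, peers):
    """Ask peers in turn; accept the first copy whose hash matches the address."""
    for peer in peers:
        data = peer.get(cid)
        if data is not None and content_id(data) == cid:
            return data
    return None  # nobody reachable holds a valid copy


artifact = b"a zine from 1996"
cid = content_id(artifact)

gone_offline = {}                    # the original host dropped the file
tampered = {cid: b"something else"}  # a peer serving the wrong bytes
attic = {cid: artifact}              # one forgotten copy somewhere

# The link survives as long as any single honest copy is reachable.
recovered = fetch(cid, [gone_offline, tampered, attic])
assert recovered == artifact
```

Note that the requester never has to trust a peer. A copy either hashes to the address or it gets skipped.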

A challenge to flag: the degree of resilience you get from this long tail will depend upon the diversity of the content in the long tail. The truth is, without incentives to push toward diverse “seed bank” caching, you’re mostly going to keep power-law popular content alive. Incentives will be important here.

John Backus @backus

BitTorrent is a well-tuned game-theoretic system during download and a tragedy of the commons after. Private trackers turn the commons into a poorly tuned centrally planned economy (big improvement). Let's reframe private trackers in economic terms and tune the system! Thread 👇

Still, I think this is better than the same-origin status quo. You hear these stories about lost artifacts discovered in someone’s attic and brought back to the spotlight. This tells me it takes only one. Popular information will be endemic, but even endangered species may recover when they can escape their little island.

But for me, the real magic of content addressing is credible exit. If everything is CIDs, you can take your data with you. It’s not trapped by SaaS silos.

UCAN decouples trust from origin

So, we decouple content from domains, but now we have a trust problem. The same-origin security model anchors trust to domains. We’ll need another way.

What if we cryptographically signed everything that got published to the network? Now we don’t have to care about origin. Instead we can verify the signature of content.
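Here is a sketch of what "verify the signature, not the origin" could look like. One loud assumption: to stay within Python's standard library, it uses a Lamport one-time signature built from SHA-256. That is a swapped-in technique for illustration only. A real system would more likely use something like Ed25519, and a Lamport key pair must never sign more than one message. All function names are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    """The 256 bits of the message digest, most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    """Reveal one secret per digest bit (one-time use only)."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        _h(secret) == public[i][bit]
        for i, (bit, secret) in enumerate(zip(_bits(message), signature))
    )


private_key, public_key = keygen()
post = b"hello from no particular server"
signature = sign(private_key, post)

# Any peer can relay the post; the reader checks the key, not the host.
assert verify(public_key, post, signature)
assert not verify(public_key, b"altered in transit", signature)
```

The point is only that trust attaches to a key the author holds. Which server happened to deliver the bytes stops mattering.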

UCAN (User Controlled Authorization Networks) offers a promising primitive for authorizing users without a backend. Even better, UCANs are self-sovereign. You own and control your keys, not some app. We love UCANs enough that we’re building rs-ucan, an open-source Rust implementation.

But now we have a lot of keys to manage. Where are we going to put them?

Cryptography is a tool for turning lots of different problems into key management problems. (Dr. Lea Kissner, Head of Privacy Eng and CISO at Twitter)

My sense is that this is a problem we should be solving anyway. Keys not IDs! Keys allow for privacy, illegibility, and smooth passwordless sign-in. I think the wallet app paradigm is the way to go. Apple thinks so too.

Petnames decouple names from origin

That leaves names. DNS is already great, but hey I’m asking for a pony, so let’s explore this one as well.

By moving data to CIDs and trust to keys, we’ve designed a system with lots of meaningless numbers. How might we give these meaningful names? How about a petname system?

A petname system works like the address book on your phone. Instead of a global name system, you keep a personal address book that maps meaningful names to secure-decentralized addresses.

You can also share address books. Peers who trust each other can exchange cryptographic keys, and securely share petnames, building up webs of trust. Petnames cheat Zooko’s Triangle.

So perhaps we could use petnames to map names to keys and slugs to CIDs. Let’s imagine two address books. One that maps your names to keys,

@gordon → did:key:z6MkhaXgBZD...
@cdata  → did:key:2QtKLGpbnnE...
@sara   → did:key:6bxIl29nbwo...
...

and another that maps slugs to CIDs:

/cat-thoughts → bafybeigdyrzt...
/patterns     → bafyf3efuylqa...
/tfts         → bafygdyrzefu3...
...

Together, they amount to a self-sovereign social graph, and a personal backpack for all of your data. Both can be gossiped between peers any time a key or CID changes. You could even address petnames through some key to get a name according to someone:

@gordon/cat-thoughts
@cdata/cat-thoughts

This is a lot like DNS, except that instead of origins tied to infrastructure, we have keys which are gossiped. Infrastructure becomes fungible.
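As a sketch, resolving a name "according to someone" is just two lookups: petname to key, then slug within that key's book. Everything below is hypothetical. The keys and the first CID are the truncated examples from above, kept truncated, and I'm assuming the slug book shown belongs to @gordon. The entry for @cdata is an obvious placeholder, and `resolve` is an invented helper.

```python
# My personal address book: petname -> key (truncated example values).
my_names = {
    "gordon": "did:key:z6MkhaXgBZD...",
    "cdata": "did:key:2QtKLGpbnnE...",
}

# Slug books gossiped to me, indexed by the key that published them.
slug_books = {
    "did:key:z6MkhaXgBZD...": {"cat-thoughts": "bafybeigdyrzt..."},
    "did:key:2QtKLGpbnnE...": {"cat-thoughts": "<placeholder CID from cdata>"},
}


def resolve(address: str, names: dict, books: dict) -> str:
    """Resolve '@petname/slug' to a CID using local address books only."""
    if not address.startswith("@") or "/" not in address:
        raise ValueError(f"expected '@petname/slug', got {address!r}")
    petname, _, slug = address[1:].partition("/")
    key = names[petname]     # who do I mean by this name?
    return books[key][slug]  # what does that key call this slug?


# The same slug resolves differently depending on whose book we read.
gordon_cid = resolve("@gordon/cat-thoughts", my_names, slug_books)
cdata_cid = resolve("@cdata/cat-thoughts", my_names, slug_books)
assert gordon_cid != cdata_cid
```

No global registry is consulted at any point. If a peer gossips an updated book, only the dictionary entry for their key changes.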

One thing to keep in mind about petname systems is that they can be tricky to bootstrap. Where do I find my first friend? But there are pragmatic solutions. You can publish public address books to an ordinary website. This is what Scuttlebutt does. Maybe you could mint address books to a blockchain? Either way, it’s doable.

Where does it recentralize?

What if we managed to pull all this off, decoupling trust, data, infrastructure and names? Where would this leave us?

Well, infrastructure economies of scale and attention scarcity will not go away. These are fundamental constraints, and the power laws around them are likely to persist. It might even be that aggregation is amplified by decentralization, since aggregation is a function of demand, not supply.

The more liquid a market the more power accrued to an Aggregator. I have made similar observations over the years about Google in particular: the biggest Aggregator of them all has the least amount of lock-in — perhaps those two points are related!
(Stratechery, 2022. “Aggregation Follow-up, Netflix’s Ad Partners”)

Aggregators, yes, but aggregators without lock-in! This seems like a strictly better situation. Credible exit and a self-sovereign social graph are not nothing. If an aggregator abuses its position, my attention can shift elsewhere, and I can take my friends and my data with me. Aggregators might come out ahead, but so do users.

Where is aggregation or centralization likely to happen? Anywhere abundance creates new scarcities. Keys are free, so centralization is likely to happen around Sybil-resistance, identity, reputation. Information is free, so centralization is likely to happen around spam, moderation, search, ads, discovery. Some of these may have decentralized solutions, some not. There are risks here. None of them are new.

My best guess is that such a system might result in faster emergence of big aggregators and also faster collapse. A hotter innovation loop. New aggregators could rapidly emerge and compete, without having to contend with incumbent network effect.

Another possibility this architecture leaves open is friend-to-friend networking. Petnames are decentralized, and CIDs too. You could exit IPFS’ global DHT and have something like Secure Scuttlebutt: a gossip-based network.

A common situation is that the power law is obeyed… for large values of k, but not for the small-k regime. (Newman, 2018. “Networks”)

This kind of gossip network would be perfect for Dunbar-scale community. Since scale-free network power laws break down at the cozy end of the long tail, you could imagine a cozyweb that resists aggregation through illegibility. Small spaces have room for mutuality.

Overall, I think I see the outline of something protopian, rather than utopian. Better than yesterday. Such an architecture might give users ownership of data and identity, and un-break the evolutionary loop of technology. No doubt it would also raise new challenges.

If you decentralize, the system will recentralize, but one layer up. Something new will be enabled by decentralization. That sounds like evolution through layering, like upward-spiraling complexity. That sounds like progress to me.

Credit @cdata who expanded our early conversations about mapping DNS to CIDs into this working hypothesis, layering in self-sovereign signing and petnames.