Re: On community influencing (was Re: [PATCH v8 2/2] rust: add dma coherent allocator abstraction.)

Hector Martin <marcan@xxxxxxxxx> · Sat, 8 Feb 2025 04:18:44 +0900

On 2025/02/08 3:33, Linus Torvalds wrote:
> On Fri, 7 Feb 2025 at 10:02, Hector Martin <marcan@xxxxxxxxx> wrote:
>>
>> Meanwhile, for better or worse, much of Linux infra *is* centralized -
>> for example, the mailing lists themselves, and a lot of the Git hosting.
> 
> The mailing lists are mostly on kernel.org, but the git hosting most
> certainly is not centralized in any way.
> 
> The kernel.org git repositories used to be special in that I didn't
> require signed tags for them, because I trusted the user maintenance.
> But I was encouraging signed tags even back then, and once it got to
> the point where most were signed anyway, I just made it a rule. So now
> kernel.org isn't special even in that respect.
> 
> Now, kernel.org is very much _convenient_. And you see that in the
> stats: of my pulls in the last year, 85% have been from kernel.org.
> But that is very much because it is convenient, not because it's
> centralized.
> 
> But that still leaves the 15% that aren't kernel.org.

For all intents and purposes, 85% centralized might as well be fully
centralized. That is, any downtime on kernel.org will affect the
community effectively the same as downtime on a true central SPOF would.

> More importantly, not being centralized was very much a basic tenet
> of git, so *if* git.kernel.org were to become problematic, it's very
> easy to move git repositories anywhere else. Very much by design.

And this is why, rather than focusing on *decentralization*, I think we
should be focusing on *recoverability* and *data availability*.

There are two distinct scenarios. If kernel.org goes down for a while,
that screws with people's workflow. They can find alternatives, but it
will cause immediate, acute disruption, even while using "decentralized"
git. You still need to tell people where the new repos are, reconfigure
remotes, etc. If the disruption is expected to be short-lived, a lot of
people will choose to just ride it out, rather than find alternatives,
or will only use alternatives in an ad-hoc manner where strictly
necessary. Nobody does a "true" complete migration to another Git
hosting when faced with a brief, temporary outage (heck, even just the
time it takes to push a Linux Git tree from scratch to somewhere new is
much longer than ideal, more so if your internet connection isn't the
best or the routing to the server in question isn't great). At most you
push whatever you're working on as needed somewhere else.

If kernel.org goes down permanently or long term (or becomes
problematic), that's when you start doing a full migration. And thanks
to the Git design, this is possible - not because it's decentralized per
se, but because everyone has local copies of almost everything, and they
can easily be restored onto a new service.
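
A minimal sketch of that restore step, assuming nothing beyond stock git.
The directory names and the throwaway commit are hypothetical stand-ins:
two local bare repositories play the old and the replacement hosting
service, and "work" is a developer's existing clone.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: old-host.git is the service that disappears,
# new-host.git is an empty repository on the replacement service.
tmp=$(mktemp -d)
git init -q --bare "$tmp/old-host.git"
git init -q --bare "$tmp/new-host.git"
git init -q "$tmp/work"

cd "$tmp/work"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "history that survives in the local clone"
git remote add origin "$tmp/old-host.git"

# Simulate the old host going away for good.
rm -rf "$tmp/old-host.git"

# Repoint the existing clone at the replacement host...
git remote set-url origin "$tmp/new-host.git"

# ...and republish every local branch and tag from the local copy.
git push -q origin --all
git push -q origin --tags

# Compare the local branch tip with what the new host now holds.
branch=$(git symbolic-ref --short HEAD)
local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$tmp/new-host.git" rev-parse "refs/heads/$branch")
echo "local:  $local_head"
echo "remote: $remote_head"
```

The point of the sketch is that nothing is fetched from the dead host:
everything pushed to the new one comes out of the local clone.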

The same goes for a forge scenario. If it goes down briefly, you fix it,
and you can still use email to communicate with people ad hoc and send
git changes around through any other hosting. It's still Git at the end
of the day. Emailed patches and Git trees aren't going to become
unusable just because most or all of the regular development happens on
a forge.

And if it goes down permanently or long term, or becomes problematic,
what really matters isn't that it's "decentralized" as in federation.
It's that the data is available, in much the same way it is for
lore.kernel.org, and that people are actively backing it up (and it
isn't hard to arrange for a handful of folks/organizations to have
up-to-date mirrors; you don't need a Git-like scenario where literally
everyone has full repo dumps. Heck, I have a big NAS at home, I'd
volunteer myself). Then, the service can be restored elsewhere when
needed. If the issue is with the specific forge software used, you can
always develop a migration plan to different software (which, if the
risk is managed properly, shouldn't be a time-critical process anyway).

- Hector