On Thu, Feb 01, 2024 at 12:10:11PM +0000, Hans Meiser wrote:
> is there any current discussion about moving Git development away from
> using a mailing list to some modern form of collaboration?
> I'd like to be able to follow a structured discussion in issues and to
> contribute to the Git documentation, but the mailing list currently just
> bloats my personal inbox with loads of uninteresting e-mails in an
> unstructured waterfall of messy discussion that I am not able to follow
> professionally.
Here's a perspective from the world of Linux kernel, where this
discussion is
continuously raging. Funnily enough, the main objection a lot of kernel
maintainers have to forges is that they make it really hard to find
relevant
discussions once the volume goes above a certain threshold. These
folks have
become *extremely* efficient at querying and filtering the mailing
list
traffic, to the point where all they ever see are just those
discussions
relevant to their work. They love the fact that it all arrives into
the same
place (their inbox) without having to go and click on various
websites, each
with their own login information, UI, and preferred workflow.
The kernel maintainers are able to review tens of thousands of patches
monthly
with only about a hundred or so top maintainers. To them, this system
is
working great, especially now that some tools allow easy ways to
query,
retrieve, verify, and apply patches (shameless plug for lore, lei, and
b4
here).
The obvious problem, of course, is that these folks are FOSS's
"marathon
runners" who got really good at their workflow, but the situation is
different
for anyone else who is just starting out. Any new kernel maintainer
stepping
up obviously finds this overwhelming, because they aren't yet so good
at
filtering the huge volume of the mailing list traffic and to them it's
just a
torrent of mostly irrelevant patches.
> Are you considering migrating?
Yes, of course, this is constantly under consideration. There isn't
some sort
of anti-forge cabal that is preventing things from going forward, but
there
are some serious hurdles and questions to weigh:
- How to avoid vendor lock-in? Those of us who have been around for a
  while have seen forges bloom, and then shrink into irrelevance (e.g.
  BitKeeper) or slowly ensh*ttify to the point of unusability
  (SourceForge). GitHub is a proprietary service owned by a single
  company who are currently FOSS-friendly, but have certainly been
  extremely FOSS-hostile in the past. GitLab is open-core, and the
  current record for open-core projects isn't very encouraging (Puppet
  open-cored themselves into irrelevance, Terraform has gone
  full-proprietary, among the most recent examples). Full-FOSS
  alternatives exist, but people aren't really that enthused about using
  less-popular solutions like Forgejo, because they hate unfamiliar UIs
  almost as much as, or even more than, they hate unfiltered mailing
  lists.
- How to avoid centralization and single points of failure? If Linux or
  Git moves to a self-hosted forge, how do we ensure that an adversary
  can't stop all development on a project by knocking it offline for
  weeks? This has literally just happened to Sourcehut and Codeberg --
  and as far as anyone can tell, the attacker was just bored and knocked
  them out because they could. Yes, you can knock out vger, but this
  will only impact the mailing list -- people can still send around
  patches and hold discussions by temporarily moving to alternative
  hosts. With the distributed nature of the mailing list archives, this
  can even be largely transparent to anyone using lei-queries.
- How to avoid alienating these hundreds of key maintainers who are now
  extremely proficient at their query-based workflows? We're talking
  about an extremely finely-tuned engine that is performing remarkably
  well -- we don't want to disrupt development for months just to try
  things out with a forge and find that it isn't working out.
Finally, there's also the consideration of current trends. One upside
of "AI"
(LLM, really) technologies is that they are extremely good at taking
in a huge
source of data and finding relevant information based on natural
language
queries. I can very easily see a mechanism spring up in the next year
or less
where you can issue a query like "send me any threads about reftables or
promisor remotes if they contain follow-ups from Junio" and reasonably
expect this to work and work great -- all while keeping things
decentralized in addition to distributed.
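To make that kind of query a bit more concrete, here is a rough sketch of
the plain keyword-and-sender version of it in Python. This is not how lei
or any LLM-based tool works; the Message class, the matching_threads()
helper, the thread ids, and the example.com addresses are all made up for
illustration, and it assumes the messages of each thread are already in
chronological order.

```python
from dataclasses import dataclass


@dataclass
class Message:
    thread_id: str  # hypothetical id shared by all messages in one thread
    sender: str
    subject: str
    body: str


def matching_threads(messages, keywords, follow_up_from):
    """Return ids of threads that mention any keyword and contain a
    follow-up (any message after the first) from the given sender."""
    # Group messages by thread, preserving the order they arrived in.
    by_thread = {}
    for msg in messages:
        by_thread.setdefault(msg.thread_id, []).append(msg)

    hits = []
    for thread_id, thread in by_thread.items():
        text = " ".join(f"{m.subject} {m.body}" for m in thread).lower()
        on_topic = any(k.lower() in text for k in keywords)
        # Skip the first message: only replies count as follow-ups.
        has_follow_up = any(
            follow_up_from.lower() in m.sender.lower() for m in thread[1:]
        )
        if on_topic and has_follow_up:
            hits.append(thread_id)
    return hits


# Made-up sample data, purely for illustration.
messages = [
    Message("t1", "Alice <alice@example.com>",
            "[PATCH] reftable: fix iterator", "..."),
    Message("t1", "Junio <junio@example.com>",
            "Re: [PATCH] reftable: fix iterator", "Looks good."),
    Message("t2", "Bob <bob@example.com>", "[PATCH] doc: typo", "..."),
    Message("t3", "Carol <carol@example.com>",
            "promisor remote question", "..."),
]

print(matching_threads(messages, ["reftable", "promisor remote"], "Junio"))
# prints ['t1']
```

Thread t3 is on topic but has no follow-up from Junio, so it is skipped.
A natural-language front end would only need to turn the sentence into
the keywords and sender arguments; the filtering itself stays this simple.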
Above all, this isn't a "forges are terrible and shouldn't be used"
response
-- they are clearly useful, especially when it comes to CI
integrations. A
large part of my work is bridging forges with mailing lists and
vice-versa,
which I hope I'll be able to do in the near future (GitGitGadget
already does
it with GitHub, but my goal is to have a pluggable multi-forge
solution). I
just wanted to highlight the aspects that aren't necessarily obvious
or
visible from the outside.