Re: Alternates advertisement on GitHub

Hi Stolee,

On Fri, Jul 26, 2019 at 09:18:50AM -0400, Derrick Stolee wrote:
> On 7/25/2019 11:18 PM, Taylor Blau wrote:
> > Hi everybody,
> >
> > Pushes to forks of git.git hosted on GitHub now advertise the tips of
> > git.git as well as branches from your fork.
> >
> > You may recall that Peff and I have sent a handful of patches to allow
> > repositories to customize how they gather references to advertise from
> > an alternate, and then to use those alternate tips as part of the
> > connectivity check (in [1] and [2], respectively).
>
> I'm glad to hear you deployed this so quickly after review!

Thanks :-). There was a good chunk of additional work having to do with
how we replicate repositories at the storage layer, but it didn't have
much to do with upstream git (which is why I avoided mentioning it in my
original email).

> > GitHub used to advertise '.have's on pushes to forked repositories, but
> > hasn't done so since 2012. We aggregate data from all forks into a
> > 'network.git', and expose the tips of each fork as:
> >
> >   refs/remotes/<fork-id>/<refname>
> >
> > Each fork lists the 'network.git' as its alternate, and thus the
> > advertisement can get prohibitively large when there are many forks of a
> > repository.
> >
> > Michael Haggerty's work on packed refs makes finding references
> > pertaining only to the root computationally efficient, and [1] makes it
> > possible to filter down when computing the set of references to
> > advertise. With [1], we can specify that computation exactly and only
> > advertise branch tips from the root of a fork network.
> >
> > We've been slowly rolling this out to a handful of repository networks,
> > including forks of git.git hosted on GitHub. If you host your fork on
> > GitHub, you shouldn't notice anything. Hopefully, pushes to your fork
> > will result in smaller packfiles. In either case, nothing should break;
> > if it does, please feel free to email me, or support@xxxxxxxxxx.
>
> I tested this by updating 'master' in derrickstolee/git to match gitster/git
> and the pack was empty (ref update only). This makes fork management so much
> simpler!

Interesting. I'm glad to hear that it was working, but I did a
double-take at this paragraph, since I see that 'derrickstolee/git' is
forked from 'git/git', not 'gitster/git'. I wasn't quite sure what was
going on until I realized that 'git/git' and Junio's fork had an
identical 'master'.

Even if they had differed, it shouldn't matter, so long as 'git/git'
was ahead of 'gitster/git'. If Junio's fork was ahead, you would end up
pushing the new objects, and we'd immediately gc them away.
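
To make the two cases concrete, here is a toy model of the pack
computation. It is only a sketch: the commit names and both helper
functions are made up for illustration, it tracks commits only (real
packs also carry trees and blobs), and it is not how git computes a
pack.

```python
def reachable(parents, tips):
    """Return every commit reachable from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def commits_to_send(parents, pushed_tip, advertised_tips):
    """Commits reachable from the pushed tip but from no advertised tip."""
    return reachable(parents, [pushed_tip]) - reachable(parents, advertised_tips)


# Hypothetical linear history: A <- B <- C <- D
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# The fork sits at B and the network root advertises D. Pushing D sends
# no commits: the push is a ref update only.
root_ahead = commits_to_send(parents, "D", ["B", "D"])

# The root only advertises C, and D exists only in another fork.
# Pushing D now has to send D itself.
other_fork_ahead = commits_to_send(parents, "D", ["B", "C"])
```

In the first case the set is empty, which matches the empty pack you
saw. In the second, the client has no way of knowing that the server
already holds D, so it sends it anyway.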

This makes me think about whether the situation could be improved,
perhaps by having your client tell GitHub that it has Junio's copy as a
remote, and then GitHub responding by also advertising Junio's branch
tips (if they are ahead of the network root).

This could feasibly even be implemented behind a v2 capability, but it
seems to reveal a lot of information about the pusher's setup, so
perhaps it would make sense to hide this behind a configuration option.
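
Roughly, the server-side selection I have in mind might look like the
sketch below. Everything in it is hypothetical: the fork ids, the
function names, and my reading of "ahead" as "a descendant of the
root's tip for the same ref". Nothing like this exists in git or at
GitHub today.

```python
def reachable(parents, tips):
    """Return every commit reachable from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def refs_of_fork(network_refs, fork_id):
    """Map <refname> -> oid for one fork's refs/remotes/<fork-id>/<refname>."""
    prefix = f"refs/remotes/{fork_id}/"
    return {
        name[len(prefix):]: oid
        for name, oid in network_refs.items()
        if name.startswith(prefix)
    }


def tips_to_advertise(network_refs, parents, root_id, client_remotes):
    """Root tips, plus tips of client-named forks that are ahead of the root."""
    root = refs_of_fork(network_refs, root_id)
    advertised = set(root.values())
    for fork_id in client_remotes:
        for refname, oid in refs_of_fork(network_refs, fork_id).items():
            root_oid = root.get(refname)
            if root_oid is None or oid == root_oid:
                continue
            # Assumption: "ahead" means the root tip is an ancestor of it.
            if root_oid in reachable(parents, [oid]):
                advertised.add(oid)
    return advertised


# Hypothetical history A <- B <- C <- D, and three forks in one network.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
network_refs = {
    "refs/remotes/1/heads/master": "C",  # network root
    "refs/remotes/2/heads/master": "D",  # ahead of the root
    "refs/remotes/3/heads/master": "B",  # behind the root
}

with_ahead_fork = tips_to_advertise(network_refs, parents, "1", ["2"])
with_behind_fork = tips_to_advertise(network_refs, parents, "1", ["3"])
```

Here the fork that is ahead contributes its tip D to the advertisement,
while the one that is behind adds nothing beyond the root's C.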

> Thanks!
> -Stolee

Thanks,
Taylor


