Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 24, 2018 at 08:17:14AM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > I was suggesting that check_everything_connected() is not strictly
> > transport-related, so would be inappropriate for transport.*, and we'd
> > need a more generic name. And my "either way" was that I could see
> > an argument that it _is_ transport related, since we only call it now
> > when receiving a pack. But that doesn't have to be the case, and
> > certainly implementing it with "rev-list --alternate-refs" muddies that
> > considerably.
> 
> Even after 7043c707 ("check_everything_connected: use a struct with
> named options", 2016-07-15) unified many into check_connected(),
> there still are different reasons why we call to find out about the
> connectivity, and I doubt we can afford to have a single knob that
> is shared both for transport and other kind of connectivity checks
> (like fsck or repack).  Do we want to be affected by "we pretend
> that these are the only refs exported from that alternate object
> store" when repacking and pruning only local objects and keep us
> rely on the alternate, for example?

Actually, yes, I think there is value in a single knob. At least that's
what I'd want for our (GitHub's) use case.

Remember that these alternate refs might not exist at all (the
alternates mechanism can work with just a bare "objects" directory,
unconnected from a real git repo). So I think anything using them has to
view it as a "best effort" optimization: we might or might not know
about some ref tips that might or might not cover the whole set of
objects in the alternate. They're the things we _guarantee_ that the
alternate has full connectivity for, and it might have more.

So I think it's conceptually consistent to always show a subset. I did
qualify with "for our use case" because some people might be primarily
concerned with the bandwidth of sending .haves across the network.
Whereas at our scale, even enumerating them at all is prohibitively
expensive.

One thing we could do is add a "core" config now (whether it's in core.*
or wherever). And then if later somebody wants receive-pack to behave
differently, we have an out: we can add transfer.alternateRefsCommand or
even receive.alternateRefsCommand that take precedence in those
situations.

Of course we could add the more restricted ones now, and add the "core"
one later as new uses grow. But that's more work now, since we'd have to
plumb through that context to the for_each_alternate_ref() interface.
I'd rather punt on that work until later (because I suspect that "later"
will never actually come).

> In any case it is good that these configuration variables are
> defined on _our_ side, not in the alternate---it means that we do
> not have to worry about the case where the alternateRefsCommand lies
> and tells us that an object that the alternate does not actually
> have exists at a tip of a ref in an attempt to confuse us, etc.

Yes. It also makes it easy to use "git -c" to override the scheme if you
want to (as opposed to mucking with on-disk files in the alternate).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux