Re: [RFH] limiting ref advertisements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 15, 2016 at 4:21 AM, Jeff King <peff@xxxxxxxx> wrote:
> Thanks for responding to this.

Glad to help (or more precisely annoy you somewhat :D)

> I've been meaning to get back to it with
> some code experiments, but they keep getting bumped down in priority. So
> let me at least outline some of my thoughts, without code. :)
>
> I was hoping to avoid right-anchoring because it's expensive to find all
> of the right-anchored cases (assuming that ref storage is generally
> hierarchical, which it is now and probably will be for future backends).

Urgh.. I completely forgot about future refs backends. Yeah this would
kick both wildmatch and right-anchoring out of the window.

For the record I almost suggested BPF as well (to keep the server side
simple, but at the super high cost of client side). That would also go
out of the window the same way wildmatch and right-anchoring does.

>      But remember that these are "early capabilities", before the server
>      has spoken at all. So the client doesn't know if we can handle v2.
>      So we have to send _both_ (and v2-aware servers can ignore the v1).
>
>        advertise-lookup-v1=master
>        advertise-lookup-v2=master
>
>      But that's not quite enough. A v1 server won't look in refs/notes
>      at all. So we have to say that, too:
>
>        advertise-lookup-v1=refs/notes/master
>
>      And of course the v1 server has no idea that this isn't necessary
>      if we already found refs/heads/master.

We discussed a bit about upgrading upload-pack version 1 to 2 in more
than one session: the first fetch just does v1 as normal, the server
returns v1 response to but also advertises that v2 is supported. The
client keeps this info and skips v1 and tries v2 right away in the
following fetches, falling back to v1 (new fetch session) if v2 is
unsupported. Can it work the same way here too?

I'm in favor of this option 2 (without trying to be absolute backward
compatible with older lookup versions) since it allows us to optimize
for common case and experiment a bit. Once we know better we can make
the next version that hopefully suits everybody.

>      So I think you really do need the client to be able to say "also
>      look at this pattern".

What about the order of patterns? Does it matter "this pattern" is in
the middle or the end of the pattern list? I suppose not, just
checking...

But does this call for the ability to remove a pattern from the
pattern list as well, as a way to narrow down the search scope and
avoid sending unwanted refs?

> Of course we do still want left-anchoring, too. Wildcards like
> "refs/heads/*" are always left-anchored. So I think we'd have two types,
> and a full request for
>
>   git fetch origin +refs/heads/*:refs/remotes/origin/* master:foo
>
> would look like:
>
>   (1) advertise-pattern-v1
>   (2) advertise-pattern=refs/notes/%s
>   (3) advertise-prefix=refs/heads
>   (4) advertise-lookup=master
>
> where the lines mean:
>
>   1. Use the standard v1 patterns (we could spell them out, but this
>      just saves bandwidth. In fact, it could just be implicit that v1
>      patterns are included, and we could skip this line).
>
>   2. This is for our fictional future version where the client knows
>      added refs/notes/* to its DWIM but the server hasn't yet.
>
>   3. Give me all of refs/heads/*
>
>   4. Look up "master" using the advertise patterns and give me the first
>      one you find.

Well.. it sounds good to me. But I would not trust myself on refs matters :D

> So given that we can omit (1), and that (2) is just an example for the
> future, it could look like:
>
>   advertise-prefix=refs/heads
>   advertise-lookup=master
>
> which is pretty reasonable. It's not _completely_ bulletproof in terms
> of backwards compatibility. The "v1" thing means the client can't insert
> a new pattern in the middle (remember they're ordered by priority).

OK so pattern order probably matters...

> So
> maybe it is better to spell them all out (one thing that makes me
> hesitate is that these will probably end up as URL parameters for the
> HTTP version, which means our URL can start to get a little long).
>
> Anyway. That's the direction I'm thinking. I haven't written the code
> yet. The trickiest thing will probably be that the server would want to
> avoid advertising the same ref twice via two mechanisms (or perhaps the
> client just be tolerant of duplicates; that relieves the server of any
> duplicate-storage requirements).

Thanks for sharing.
-- 
Duy



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]