Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 19, 2021 at 06:09:42PM -0500, Jeff King wrote:
> On Tue, Jan 19, 2021 at 01:19:17PM -0500, Taylor Blau wrote:
>
> > To attempt to reduce the difference between the number of refs
> > traversed, and the number of refs sent, only traverse references which
> > are in the longest common prefix of the given prefixes. This is very
> > reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
> > disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
> > multi-patterned 'git for-each-ref' invocations.
> >
> > The only difference here is that we are operating on ref prefixes, which
> > do not necessarily point to a single reference. That is just fine, since
> > all we care about is finding the longest common prefix among prefixes
> > which can be thought of as refspecs for our purposes here.
>
> This second paragraph confused me. Aren't the inputs to for-each-ref
> also prefixes?
>
> I guess they require an explicit '*', but fundamentally it's the same
> concept (and certainly they are not just single references).

Yeah, that is the point that I was trying to make. But re-reading this
patch after knowing that it confused you, I think the clearest way to
make that point is to drop that second paragraph entirely.

> > Similarly, for_each_fullref_in_prefixes may return more results than the
> > caller asked for (since the longest common prefix might match something
> > that a longer prefix in the same set wouldn't match) but
> > ls-refs.c:send_ref() discards such results.
>
> Based on my other poking, I'm not entirely sure that we can return too
> many results. But I do think it's worth keeping the caller more careful.

It can return more results, but I don't think that my writing in
b31e2680c4 is particularly clear. Here's an example, though. Say I ask
for `git for-each-refs 'refs/tags/a/*' 'refs/tags/a/b/c'`. The LCP of
that is definitely "refs/tags/a", which might traverse other stuff like
"refs/tags/a/b/d", which wouldn't get matched by either.

> > The code introduced in b31e2680c4 is resilient to stop early (and
> > return a shorter prefix) when it encounters a metacharacter (as
> > mentioned in that patch, there is some opportunity to improve this, but
> > nobody has done it).
>
> I agree that is how b31e2680c4 works, but we don't care about that
> behavior here, since we have strict prefixes. Isn't the argument we need
> to make the other way? I.e., that stopping early on a metacharacter will
> not hurt us. Because at worst we return too many results (hey, there's a
> case!) and because we would not expect metacharacters in the prefixes
> anyway, as they are illegal in refnames.

Yeah, thinking on it more I agree that's the argument we should be
making here. I updated the patch to reflect it.

> > There are two remaining small items in this patch:
> >
> >   - If no prefixes were provided, then implicitly add the empty string
> >     (which will match all references).
>
> I wonder if for_each_fullref_in_prefixes() should do that, since that is
> exactly what the other caller does, too. OTOH, maybe it is better to
> make the callers be more explicit. In which case should it maybe BUG()
> on an empty set of prefixes, rather than letting the caller assume some
> behavior?

Hmm. I don't feel strongly either way, but I think that the BUG is
probably the most sensible option.

> >   - Since we are manually munging the prefixes, make sure that we
> >     initialize it ourselves (previously this wasn't necessary since the
> >     first strvec_push would do so).
>
> I think this was an existing bug (that we were just lucky it was
> impossible to trigger, because nobody looked for the NULL sentinel), and
> would make more sense as a separate fix.

Right. I'll make sure to pull this one out into a separate patch and
credit Jacob with authorship.

> -Peff

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux