Re: [PATCH v3 09/16] refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 13, 2023 at 05:27:05PM -0700, Junio C Hamano wrote:
> >   - Clean up the results from the previous step: discard empty regions,
> >     and combine adjacent regions.
>
> Say something like
>
>    The resulting list of regions that would never contain refs that
>    are not excluded is called the "jump list".
>
> here, probably.  Otherwise the first reference to "jump list" we see
> later feels a bit too abrupt.

Good suggestion. I phrased it slightly differently in the version that
I'll send shortly, but the spirit is the same.

> > There are a few other gotchas worth considering. First, note that the
> > jump list is sorted, so once we jump past a region, we can avoid
> > considering it (or any regions preceding it) again. The member
> > `jump_pos` is used to track the first next-possible region to jump
> > through.
> >
> > Second, note that the exclusion list is best-effort, since we do not
> > handle loose references, and because of the meta-character issue above.
>
> I found this a bit misleading; a natural reading of "exclusion list"
> is that the phrase refers to the list of exclude patterns given from
> the command line, and users would be upset if the processing of it
> is best effort.
>
> I think what you meant to say was that optimization to avoid full
> scan using the jump list does not aim for perfection, and entries
> that are not skipped using the jump list may still be excluded by
> the exclude patterns.

Yep, exactly. I think this was from an earlier version from before this
was called "jump lists", and I missed it when find-and-replacing
through the patch messages. Thanks for spotting.

> > In repositories with a large number of hidden references, the speed-up
>
> "hidden" -> "excluded".  Your final objective to implement the
> feature to exclude refs using patterns and optimize it using the
> jump list data may be to implement "hidden references", but that
> hasn't be talked about in the above.  All we have been hearing was
> that we are optimizing the walk over packed-refs using exclude
> patterns.

Yep, thanks.

> > +static const char *ptr_max(const char *x, const char *y)
> > +{
> > +	if (x > y)
> > +		return x;
> > +	return y;
> > +}
>
> Hopefully the compiler would inline the function without being told.
>
> These pointers point into the same mmapped region of contiguous
> memory that holds the contents of the packed-refs file, so
> comparison between them is always defined.  Good.
>
> I wondered if
>
> 	return (x > y) ? x : y;
>
> is easier to read, simply because it treats both cases more equally
> (in other words, as written, (x>y) appears more "special"), but that
> is minor.

Yeah, I think that any reasonable compiler would almost certainly inline
this, especially at higher optimization levels. But I agree with your
suggestion nonetheless, thanks.

> > +		/*
> > +		 * We can't feed any excludes with globs in them to the
> > +		 * refs machinery.  It only understands prefix matching.
> > +		 * We likewise can't even feed the string leading up to
> > +		 * the first meta-character, as something like "foo[a]"
> > +		 * should not exclude "foobar" (but the prefix "foo"
> > +		 * would match that and mark it for exclusion).
> > +		 */
> > +		if (has_glob_special(*pattern))
> > +			continue;
>
> OK.  When we have a "literal" exclude pattern and another "glob"
> exclude pattern, we can afford to ignore the "glob" one when
> building the jump list and include only the "literal" one, because
> the jump list is used only to skip over entries that obviously can
> never be in the result, *and* --exclude are additive (i.e. being
> on jump list because of the "literal" pattern is a reason enough to
> be excluded from the result; matching or not matching the other
> patterns does not affect the fate of the ref that got excluded due
> to matching the "literal" pattern).
>
> Makes sense.

Exactly.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux