Re: [PATCH v2 09/16] refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 06, 2023 at 09:00:52AM +0200, Patrick Steinhardt wrote:
> > @@ -785,6 +802,13 @@ struct packed_ref_iterator {
> >  	/* The end of the part of the buffer that will be iterated over: */
> >  	const char *eof;
> >
> > +	struct jump_list_entry {
> > +		const char *start;
> > +		const char *end;
> > +	} *jump;
> > +	size_t jump_nr, jump_alloc;
> > +	size_t jump_pos;
> >
> Nit: I had some trouble with `jump_pos` given that it sounds so similar
> to `iter->pos`, and thus you tend to think that they both apply to the
> position in the packed-refs file. `jump_curr` or `jump_idx` might help
> to avoid this confusion.

Very fair, thanks for observing. I went with "jump_cur" (as a shorthand
for "cursor").

> > +	for (pattern = excluded_patterns; *pattern; pattern++) {
> > +		struct jump_list_entry *e;
> > +
> > +		/*
> > +		 * We can't feed any excludes with globs in them to the
> > +		 * refs machinery.  It only understands prefix matching.
> > +		 * We likewise can't even feed the string leading up to
> > +		 * the first meta-character, as something like "foo[a]"
> > +		 * should not exclude "foobar" (but the prefix "foo"
> > +		 * would match that and mark it for exclusion).
> > +		 */
> > +		if (has_glob_special(*pattern))
> > +			continue;
> > +
> > +		ALLOC_GROW(iter->jump, iter->jump_nr + 1, iter->jump_alloc);
> > +
> > +		e = &iter->jump[iter->jump_nr++];
> > +		e->start = find_reference_location(snapshot, *pattern, 0);
> > +		e->end = find_reference_location_end(snapshot, *pattern, 0);
>
> Nit: we could detect the non-matching case here already, which would
> allow us to skip an allocation. It's probably pre-mature optimization
> though, so please feel free to ignore.

Probably so, this allocation is so lightweight in comparison to all of
the other things that for-each-ref does throughout its execution that I
think it's probably negligible to shave off a few allocations.

> > +	}
> > +
> > +	if (!iter->jump_nr) {
> > +		/*
> > +		 * Every entry in exclude_patterns has a meta-character,
> > +		 * nothing to do here.
> > +		 */
> > +		return;
> > +	}
> > +
> > +	QSORT(iter->jump, iter->jump_nr, jump_list_entry_cmp);
> > +
> > +	/*
> > +	 * As an optimization, merge adjacent entries in the jump list
> > +	 * to jump forwards as far as possible when entering a skipped
> > +	 * region.
> > +	 *
> > +	 * For example, if we have two skipped regions:
> > +	 *
> > +	 *	[[A, B], [B, C]]
> > +	 *
> > +	 * we want to combine that into a single entry jumping from A to
> > +	 * C.
> > +	 */
> > +	last_disjoint = iter->jump;
>
> Nit: if we initialized `j = 0`, then `last_disjoint` would always be
> equal to `iter->jump[j]`. We could then declare the variable inside of
> the loop to make it a bit easier to understand.

Sure, though we would then need to assign `iter->jump_nr = j + 1`, which
I think adds more confusion than inlining the variable is worth.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux