Re: [PATCH 6/6] refs/reftable: wire up support for exclude patterns

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt <ps@xxxxxx> writes:

> On Fri, Sep 13, 2024 at 07:47:06AM -0500, karthik nayak wrote:
>> Patrick Steinhardt <ps@xxxxxx> writes:
>> > +/*
>> > + * Handle exclude patterns. Returns either `1`, which tells the caller that the
>> > + * current reference shall not be shown. Or `0`, which indicates that it should
>> > + * be shown.
>> > + */
>> > +static int should_exclude_current_ref(struct reftable_ref_iterator *iter)
>> > +{
>> > +	while (iter->exclude_patterns[iter->exclude_patterns_index]) {
>> > +		const char *pattern = iter->exclude_patterns[iter->exclude_patterns_index];
>> > +		char *ref_after_pattern;
>> > +		int cmp;
>> > +
>> > +		/*
>> > +		 * Lazily cache the pattern length so that we don't have to
>> > +		 * recompute it every time this function is called.
>> > +		 */
>> > +		if (!iter->exclude_patterns_strlen)
>> > +			iter->exclude_patterns_strlen = strlen(pattern);
>> > +
>> > +		/*
>> > +		 * When the reference name is lexicographically bigger than the
>> > +		 * current exclude pattern we know that it won't ever match any
>> > +		 * of the following references, either. We thus advance to the
>> > +		 * next pattern and re-check whether it matches.
>>
>> So this means that the exclude patterns were lexicographically sorted.
>> Otherwise this would work.
>
> Indeed. Good that you call out my assumption, as I in fact didn't verify
> that it holds, and in fact it doesn't. It's not a correctness issue if
> it doesn't hold, because it would simply mean that we don't skip over
> some references where we really could. But it certainly is a perfomance
> issue.
>
> Will fix and add a test for it.
>
>> > +		 * Note that the seeked-to reference may also be excluded. This
>> > +		 * is not handled here though, but the caller is expected to
>> > +		 * loop and re-verify the next reference for us.
>> > +		 */
>>
>> The seeked-to reference here being the one with 0xff. We could get rid
>> of this by doing something like this:
>>
>>     int last_char_idx = iter->exclude_patterns_strlen - 1
>>     ref_after_pattern = xstrfmt("%s", pattern);
>>     ref_after_pattern[last_char_idx] = ref_after_pattern[last_char_idx] + 1;
>>
>> instead no?
>
> Sorry, I don't quite follow what you mean with "get rid of this". What
> exactly is "this"? Do you mean the re-looping?
>
> If so then the above doesn't fix it, no. We'd have to repeat a whole lot
> of code here to also retrieve the next entry, store it into `iter->ref`,
> check whether it is an actual ref starting with "refs/" and so on.
> Looping once very much feels like the better thing to do.
>

I definitely responded to the wrong chunk of your comment. I meant to
respond to

> * This is done by appending the highest possible character to
> * the pattern. Consequently, all references that have the
> * pattern as prefix and whose suffix starts with anything in
> * the range [0x00, 0xfe] are skipped. And given that 0xff is a
> * non-printable character that shouldn't ever be in a ref name,
> * we'd not yield any such record, either.

My point being, we can even skip "pattern + 0xff" by just setting the
seek to ref to be the next character in the pattern. But since "pattern
+ 0xff" is anyways not an expected ref. They should behave the same.

>> > @@ -580,9 +660,45 @@ static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
>> >  	.abort = reftable_ref_iterator_abort
>> >  };
>> >
>> > +static char **filter_exclude_patterns(const char **exclude_patterns)
>> > +{
>> > +	size_t filtered_size = 0, filtered_alloc = 0;
>> > +	char **filtered = NULL;
>> > +
>> > +	if (!exclude_patterns)
>> > +		return NULL;
>> > +
>> > +	for (size_t i = 0; ; i++) {
>> > +		const char *exclude_pattern = exclude_patterns[i];
>> > +		int has_glob = 0;
>> > +
>> > +		if (!exclude_pattern)
>> > +			break;
>> > +
>> > +		for (const char *p = exclude_pattern; *p; p++) {
>> > +			has_glob = is_glob_special(*p);
>> > +			if (has_glob)
>> > +				break;
>> > +		}
>>
>> Why do we need to filter excludes here? Don't the callee's already do
>> something like this?
>
> No, it doesn't. The code for exclude patterns is structured in such a
> way that the responsibility is with the backend to decide what it can
> and cannot filter. In theory there could be a backend that can exclude
> refs based on globs efficiently, even though neither the "files" nor the
> "reftable" backend can.

Thanks for the explanataion.

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux