Re: [PATCH 14/25] pickaxe -S: remove redundant "sz" check in while-loop

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 04 Feb 2021 09:56:17 -0800

René Scharfe <l.s.r@xxxxxx> writes:

>> -		while (sz &&
>> -		       !regexec_buf(regexp, data, sz, 1, &regmatch, flags)) {
>> +		while (!regexec_buf(regexp, data, sz, 1, &regmatch, flags)) {
>
> This will loop forever for regexes that match an empty string.  An
> example would be /$/.  Silly, perhaps, but still I understand this check
> less as an optimization and more as a correctness/robustness thing.
>
>>  			flags |= REG_NOTBOL;
>>  			data += regmatch.rm_eo;
>>  			sz -= regmatch.rm_eo;
>> -			if (sz && regmatch.rm_so == regmatch.rm_eo) {
>> +			if (regmatch.rm_so == regmatch.rm_eo) {
>>  				data++;
>>  				sz--;
>>  			}
>
> Before, if the match was an empty string and there was more data after
> it, then the code would consume a character anyway, in order to avoid
> matching the same empty string again.  With the patch, that character
> is consumed even if there is no more data.  This leaves 'data'
> pointing beyond the buffer and 'sz' rolls over to ULONG_MAX.  Oops. :(

While I do not care too much about NUL in the haystack, I do not
mind [13/25] either.  But this is bad.
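
To make the failure concrete, here is a minimal standalone sketch of the
loop shape being discussed.  It is an illustration under assumptions, not
the code in the tree: count_matches() is a hypothetical name, and it uses
plain POSIX regexec() on a NUL-terminated string instead of regexec_buf()
with an explicit length.  The two "sz" checks are the ones the patch
removes.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the pickaxe loop: count the matches of
 * 'pattern' in the NUL-terminated string 'data'.
 * Returns the count, or -1 if the pattern does not compile.
 */
int count_matches(const char *pattern, const char *data)
{
	regex_t re;
	regmatch_t m;
	size_t sz = strlen(data);
	int flags = 0;
	int cnt = 0;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;

	/* "sz &&" stops the loop once the buffer has been used up. */
	while (sz && !regexec(&re, data, 1, &m, flags)) {
		/* A match never ends beyond the remaining data. */
		assert(m.rm_eo >= 0 && (size_t)m.rm_eo <= sz);

		flags |= REG_NOTBOL;
		data += m.rm_eo;
		sz -= (size_t)m.rm_eo;

		/*
		 * An empty match would be found again at the same spot,
		 * so skip one character, but only if one is left.
		 * Without the "sz" check, 'sz' wraps around and 'data'
		 * ends up pointing past the end of the buffer.
		 */
		if (sz && m.rm_so == m.rm_eo) {
			data++;
			sz--;
		}
		cnt++;
	}

	regfree(&re);
	return cnt;
}
```

With both checks in place, count_matches("$", "ab") finds the single empty
match at the end and stops with sz == 0.  Dropping them lets the loop run
again on an empty remainder, where /$/ still matches, and that is exactly
the case René describes.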

This whole thing reminds me of f53c5de2 (pickaxe: fix segfault with
'-S<...> --pickaxe-regex', 2017-03-18), by the way.

Thanks.