Re: [PATCH v4 2/2] grep/pcre2: better support invalid UTF-8 haystacks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 24/01/2021 14:49, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sun, Jan 24 2021, Ramsay Jones wrote:
> 
>> On 24/01/2021 13:53, Ramsay Jones wrote:
>> [snip]
>>
>>>> diff --git a/grep.c b/grep.c
>>>> index efeb6dc58d..e329f19877 100644
>>>> --- a/grep.c
>>>> +++ b/grep.c
>>>> @@ -492,7 +492,13 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>>>>  	}
>>>>  	if (!opt->ignore_locale && is_utf8_locale() && has_non_ascii(p->pattern) &&
>>>>  	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
>>>> -		options |= PCRE2_UTF;
>>>> +		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
>>>> +
>>>> +	if (PCRE2_MATCH_INVALID_UTF &&
>>>> +	    options & (PCRE2_UTF | PCRE2_CASELESS) &&
>>>> +	    !(PCRE2_MAJOR >= 10 && PCRE2_MAJOR >= 36))
>>>                                    ^^^^^^^^^^^^^^^^^^
>>> I assume that this should be s/_MAJOR/_MINOR/. ;-)
>>>
> 
> Oops on the s/MAJOR/MINOR/g. Well spotted, I think I'll wait a bit more
> for other comments for a re-roll.
> 
> Perhaps Junio can be kind and do the s/_MAJOR/_MINOR/ fixup in the
> meantime to save be from spamming the list too much...

Umm, sorry for not making myself clear, _just_ changing MAJOR to
MINOR is insufficient.

> 
> FWIW I have tested this on a verion without PCRE2_MATCH_INVALID_UTF, but
> I think I did that by manually editing the "PCRE2_UTF" line above, and
> then wrote this bug.

Yep, I seem to have 10.34 on Linux Mint 20.1 (based on Ubuntu 20.04).

> 
>> Although, perhaps you want:
>>
>>             !(((PCRE2_MAJOR * 100) + PCRE2_MINOR) >= 1036)
>>
>> ... or something similar.
> 
> Probably better to use pcre2_config(PCRE2_CONFIG_VERSION) at that point
> and versioncmp() the string.

OK, but it needs to be 'something similar' (try putting, say, MAJOR 11
and MINOR 0->35 in your expression).

ATB,
Ramsay Jones



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux