Re: [PATCH v2 6/8] grep: stress test PCRE v2 on invalid UTF-8 data

On Fri, Jul 26 2019, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:
>
>> diff --git a/grep.c b/grep.c
>> index 6d60e2e557..5bc0f4f32a 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -615,6 +615,16 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>>  		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
>>
>>  	p->is_fixed = is_fixed(p->pattern, p->patternlen);
>> +#ifdef USE_LIBPCRE2
>> +       if (!p->fixed && !p->is_fixed) {
>> +	       const char *no_jit = "(*NO_JIT)";
>> +	       const int no_jit_len = strlen(no_jit);
>> +	       if (starts_with(p->pattern, no_jit) &&
>> +		   is_fixed(p->pattern + no_jit_len,
>> +			    p->patternlen - no_jit_len))
>> +		       p->is_fixed = 1;
>
> It is unfortunate that is_fixed() takes a counted string.
> Otherwise, using skip_prefix() to avoid "+no_jit_len" would have
> made it much easier to read. i.e.
>
> 	/* an illustration that does not quite work */
> 	char *pattern_body;
> 	if (skip_prefix(p->pattern, "(*NO_JIT)", &pattern_body) &&
>             is_fixed(pattern_body))
> 		p->is_fixed = 1;

Indeed, but then we couldn't use this for patterns that have NUL in
them, which we otherwise support (and support here). So I think it's
worth keeping it so it takes ptr+len.
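
To illustrate, here is a rough standalone sketch of how the prefix
check could stay readable while still taking ptr+len. Everything in it
is an assumption made for illustration: skip_prefix_counted() and
is_no_jit_fixed() are hypothetical helpers, and is_fixed_sketch() is a
simplified stand-in for grep.c's is_fixed() with a guessed set of
metacharacters, not the real implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for grep.c's is_fixed() (assumption: the real
 * function may use a different set of special characters). Returns 1
 * when none of the first `len` bytes is a regex metacharacter. An
 * embedded NUL is treated as an ordinary literal byte.
 */
static int is_fixed_sketch(const char *s, size_t len)
{
	static const char special[] = "$()*+.?[\\^{|";
	size_t i;

	for (i = 0; i < len; i++) {
		/* sizeof - 1 keeps the terminating NUL out of the set */
		if (memchr(special, s[i], sizeof(special) - 1))
			return 0;
	}
	return 1;
}

/*
 * Hypothetical counted-string analogue of skip_prefix(): on a match,
 * point *out past the prefix and store the remaining length.
 */
static int skip_prefix_counted(const char *buf, size_t len,
			       const char *prefix,
			       const char **out, size_t *outlen)
{
	size_t prefix_len = strlen(prefix);

	if (len < prefix_len || memcmp(buf, prefix, prefix_len))
		return 0;
	*out = buf + prefix_len;
	*outlen = len - prefix_len;
	return 1;
}

/* Returns 1 if the pattern is "(*NO_JIT)" followed by a fixed string. */
static int is_no_jit_fixed(const char *pattern, size_t patternlen)
{
	const char *body;
	size_t bodylen;

	return skip_prefix_counted(pattern, patternlen, "(*NO_JIT)",
				   &body, &bodylen) &&
	       is_fixed_sketch(body, bodylen);
}
```

Because the length travels with the pointer, a pattern such as
"(*NO_JIT)a\0b" passed with a length of 12 is still examined past the
NUL, which a strlen()-based is_fixed() could not do.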

>> +test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: setup invalid UTF-8 data' '
>> +	printf "\\200\\n" >invalid-0x80 &&
>> +	echo "ævar" >expected &&
>> +	cat expected >>invalid-0x80 &&
>> +	git add invalid-0x80
>> +'
>> +
>> +test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep ASCII from invalid UTF-8 data' '
>> +	git grep -h "var" invalid-0x80 >actual &&
>> +	test_cmp expected actual &&
>> +	git grep -h "(*NO_JIT)var" invalid-0x80 >actual &&
>> +	test_cmp expected actual
>> +'
>> +
>> +test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invalid UTF-8 data' '
>> +	test_might_fail git grep -h "æ" invalid-0x80 >actual &&
>> +	test_cmp expected actual &&
>> +	test_must_fail git grep -h "(*NO_JIT)æ" invalid-0x80 &&
>> +	test_cmp expected actual
>> +'
>> +
>> +test_expect_success GETTEXT_LOCALE,LIBPCRE2 'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' '
>> +	test_might_fail git grep -hi "Æ" invalid-0x80 >actual &&
>> +	test_cmp expected actual &&
>> +	test_must_fail git grep -hi "(*NO_JIT)Æ" invalid-0x80 &&
>> +	test_cmp expected actual
>> +'
>> +
>>  test_done



