Re: [PATCH v2 4/8] grep: consistently use "p->fixed" in compile_regexp()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 29 2019, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Jul 29 2019, Carlo Arenas wrote:
>
>> On Fri, Jul 26, 2019 at 8:09 AM Ævar Arnfjörð Bjarmason
>> <avarab@xxxxxxxxx> wrote:
>>>
>>> It's less confusing to use that variable consistently that switch back
>>> & forth between the two.
>>>
>>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
>>> ---
>>>  grep.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/grep.c b/grep.c
>>> index 9c2b259771..b94e998680 100644
>>> --- a/grep.c
>>> +++ b/grep.c
>>> @@ -616,7 +616,7 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>>>                 die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
>>>
>>>         pat_is_fixed = is_fixed(p->pattern, p->patternlen);
>>> -       if (opt->fixed || pat_is_fixed) {
>>> +       if (p->fixed || pat_is_fixed) {
>>
>> at the end of this series we have:
>>
>>   if (p->fixed || p->is_fixed)
>>
>> which doesn't make sense; at least with opt->fixed it was clear that
>> what was meant is that grep was passed -P
>
> I assume you mean "was passed -F...".
>
>> maybe is_fixed shouldn't exist and fixed when applied to the pattern
>> means we had determined it was a fixed
>> pattern and overridden the user selection of engine.
>
> They're two flags because p->fixed is "--fixed-strings", and p->is_fixed
> is "there's no metachars here". So the former case needs escaping, as
> the code just below might do (the two aren't mutually exclusive).
>
> I don't get how you think we can always fold them into one flag, but
> maybe I'm missing something...
>
>> that at least will give us a logical way to fix the pattern reported
>> in [1] and that currently requires the user to know
>> git's grep internals and know he can skip the "is_fixed" optimization
>> by doing something like :
>>
>>   $ git grep 'foo[ ]bar'
>>
>> [1] https://public-inbox.org/git/20190728235427.41425-1-carenas@xxxxxxxxx/
>
> As I noted in a reply there this seems like a way to fix a bug in "next"
> with a config knob. Yes we should fix the bug, but we've had the kwset
> code in git for years without needing this distinction, so after we work
> out the bugs I don't see why we'd need this.
>
> The reason we ignore the user's choice here is because you might
> e.g. set grep.patternType=extended in your config, and you'd still want
> grepping for a fixed "foo" to be fast.

...and more generally, for any future sanity of implementation and
maintenance I think we should only make the promise that we support
certain syntax & semantics, not that the -F, -G, -E, -P options are
guaranteed to dispatch to a given codepath.

Internally we should be free to switch between those, so e.g. if a
pattern is fixed and you configure "basic" regexp, but we know your C
library is faster for those matches with REG_EXTENDED we should just
pass that regardless of -G or -E.

Of course that means we *must* expose the same semantics (to some
reasonable extent), which means I have a lot of bugs in "next" to
address.

I'm just saying that the presence of those bugs means we should be
inclined to fix them / back out certain changes, not work around them
with user-servicable knobs.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux