Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 18 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:
>
>> When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
>> grep.patternType configuration setting, 2012-08-03) we promised that:
>>
>>  1. You can set "grep.patternType", and "[setting it to] 'default'
>>     will return to the default matching behavior".
>>
>>     In that context "the default" meant whatever the configuration
>>     system specified before that change, i.e. via grep.extendedRegexp.
>>
>>  2. We'd support the existing "grep.extendedRegexp" option, but ignore
>>     it when the new "grep.patternType" option is set. We said we'd
>>     only ignore the older "grep.extendedRegexp" option "when the
>>     `grep.patternType` option is set. to a value other than
>>     'default'".
>
> Extra period in the middle of a sentence after "set".
>
>> As before both "grep.patternType" and "grep.extendedRegexp" are
>> last-one-wins variable, with "grep.extendedRegexp" yielding to
>> "grep.patternType", except when "grep.patternType=default".
>>
>> Note that this applies as we parse the config, i.e. a sequence of:
>>
>>     -c grep.patternType=perl
>>     -c grep.extendedRegexp=true \
>>     -c grep.patternType=default
>>
>> Should select ERE due to "grep.extendedRegexp=true and
>
> Downcase "S" in "should", as this is still in the middle of the
> sentence that began with "Note that".
>
>> grep.extendedRegexp=default", not BRE, even though that's the
>
> The second one should be "grep.patternType=default".

*nod*

>> "default" patternType. We can determine this as we parse the config,
>
> Drop "even though that's the default patternType".  You've already
> explained that it is not what "default" for the "patternType" (which
> any reader who has been following so far would take as a reference
> to "grep.patternType") at all.  You can also drop ", not BRE," while
> doing so.
>
>> because:
>>
>>  * If we see "grep.extendedRegexp" we set the internal "ero" to its
>>    boolean value.
>>
>>  * If we see "grep.extendedRegexp" but
>>    "grep.patternType=[default|<unset>]" is in effect we *don't* set
>>    the internal "pattern_type_option" to update the pattern type.
>>
>>  * If we see "grep.patternType!=default" we can set our internal
>>    "pattern_type_option" directly, it doesn't matter what the state of
>>    "grep.extendedRegexp" is, but we don't forget what it was, in case
>>    we see a "grep.patternType=default" again.
>>
>>  * If we see a "grep.patternType=default" we can set the pattern to
>>    ERE or BRE depending on whether we last saw a
>>    "grep.extendedRegexp=true" or
>>    "grep.extendedRegexp=[false|<unset>]".
>
> That sounds complex enough, doesn't it?  The statement that opens
> the proposed log mesage is "gets rid of complex parsing logic that
> isn't needed", but the above sounds more like a complex logic is
> being traded with another.

The complexity this series is addressing is that you couldn't treat this
like most other built-in / library APIs, where we instantiate defaults,
do config, then CLI parsing.

>> diff --git a/grep.c b/grep.c
>> index 60228a95a4f..bb487e994d0 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
>>  
>>  define_list_config_array_extra(color_grep_slots, {"match"});
>>  
>> +static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
>> +{
>> +	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
>> +		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
>> +}
>> +
>>  /*
>>   * Read the configuration file once and store it in
>>   * the grep_defaults template.
>> @@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
>>  {
>>  	struct grep_opt *opt = cb;
>>  	const char *slot;
>> +	static int ero = -1;
>
> Is this new reentrancy issue worth it?  I think it makes the whole
> thing unnecessarily complex, compared to a more naïve "we keep track
> of the last-one-that-won for grep.extendedRegexp and
> grep.patternType separately during option and config parsing inside
> the grep_opt structure, and then combine the two when we compile the
> pattern string into regexp or pcre object" approach.

I can move back to using the variable in the struct. The post-image here
is from incrementally working on that, until I saw that it wasn't needed
outside the config parsing step itself.

Is a reentrancy issue a practical concern? This part of the grep API is
explicitly called by the whole init-once/config-once/getopt-once step in
builtin/grep.c (and revision.c).

> Let's ask it in a different way.  What is this static, that is way
> separated from all the members in the grep_opt structure, buying us?
> Certainly not the ease of understanding what the code is doing.  Not
> the size of the overall grep_opt structure (which we do not allocate
> tons anyway).  Other than that fact that you can say "I did it my
> own way, ignoring reviewer suggestions, I won!!!", I do not see any
> upside with this code.

The ease of understanding that the state isn't needed outside of the
config callback, but that's in the eye of the beholder I suppose. That's
not true of the other remaining struct members.

Then downthread you mention:

> Another problem is that there are those corporate server-side folks
> who are interested in giving an endpoint that lets clients to ask
> performing Git operations (like grep and blame).  Adding more statics
> instead of keeping track of dynamic runtime structure like grep_opt
> is deliberately making things more difficult for them, isn't it?

IIRC I'm the only person who's been advocating that as an eventual neat
thing to do, and I don't see how this would make it harder.

If we had a grep IPC call of some sort we'd surely limit it to the
"modern" config of grep.patternType, and take the opportunity to
deprecate grep.extendedRegexp from that interface.

Or it would take explicit parameters, instead of allowing the passing of
config from a remote process.

In either case the required surgery and scaffolding to make that work
will at most be trivially impacted by these changes, and likely not at
all.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux