Re: [PATCH v9 9/9] grep: simplify config parsing and option parsing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:

> In a preceding commit we changed grep_config() to be called after
> grep_init(), which means that much of the complexity here can go
> away.
>
> As before both "grep.patternType" and "grep.extendedRegexp" are
> last-one-wins variable, with "grep.extendedRegexp" yielding to
> "grep.patternType", except when "grep.patternType=default".
>
> Note that this applies as we parse the config, i.e. a sequence of:
>
>     -c grep.patternType=perl
>     -c grep.extendedRegexp=true \
>     -c grep.patternType=default
>
> should select ERE due to "grep.extendedRegexp=true and
> grep.patternType=default".

Correct.  The same reasoning would apply to:

	-c grep.extendedRegexp=true \ 
	-c grep.patternType=perl \
	-c grep.patternType=default

which should also select ERE due to "grep.extendedRegexp=true and
grep.patternType=default".  As we re-check grep.extendedRegexp every
time grep.patternType changes, and we keep extendedRegexp as a
separate copy without getting lost like earlier rounds of this
series, this should work.

Would this also work?

	-c grep.extendedRegexp=false \
	-c grep.patternType=default \
	-c grep.extendedRegexp=true

We do keep extendedRegexp, but as soon as we read .patternType that
is default, adjust_pattern_type() overwrites the pattern_type_option
member with BRE, and the fact that .patternType was specified as "do
whatever the .extendedRegexp says" is lost when we read the third
one.

So, no, I am not sure this is correct.

> diff --git a/grep.c b/grep.c
> index 60228a95a4f..f07a21ff1aa 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
>  
>  define_list_config_array_extra(color_grep_slots, {"match"});
>  
> +static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
> +{
> +	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
> +		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
> +}

OK.  Earlier rounds just replaced the UNSPECIFIED with hardcoded value "0",
which was more or less pointless.  I think this is easier to follow.

But as I said, "committing" ERE vs BRE in this manner is probably
way too early and produce an incorrect result.  Instead ...

> @@ -490,9 +446,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)

... this is the right place to do the "see if pattern_type_option is
'default' and if so use 'extended_regexp_option' to commit to either
BRE or ERE".

I guess that is what I have been repeating during the review of the
past few rounds.  Am I overlooking some other cases where that
simpler-to-explain approach does not work?

>  	p->word_regexp = opt->word_regexp;
>  	p->ignore_case = opt->ignore_case;
> -	p->fixed = opt->fixed;
> +	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
>  
> -	if (!opt->pcre2 &&
> +	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
>  	    memchr(p->pattern, 0, p->patternlen))
>  		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
>  
> @@ -544,14 +500,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>  		return;
>  	}
>  
> -	if (opt->pcre2) {
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
>  		compile_pcre2_pattern(p, opt);
>  		return;
>  	}
>  
>  	if (p->ignore_case)
>  		regflags |= REG_ICASE;
> -	if (opt->extended_regexp_option)
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
>  		regflags |= REG_EXTENDED;
>  	err = regcomp(&p->regexp, p->pattern, regflags);
>  	if (err) {
> diff --git a/grep.h b/grep.h
> index 89a2ce51130..f89324e9aa9 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -143,7 +143,6 @@ struct grep_opt {
>  	int unmatch_name_only;
>  	int count;
>  	int word_regexp;
> -	int fixed;
>  	int all_match;
>  	int no_body_match;
>  	int body_hit;
> @@ -154,7 +153,6 @@ struct grep_opt {
>  	int allow_textconv;
>  	int extended;
>  	int use_reflog_filter;
> -	int pcre2;
>  	int relative;
>  	int pathname;
>  	int null_following_name;
> @@ -183,6 +181,7 @@ struct grep_opt {
>  	.relative = 1, \
>  	.pathname = 1, \
>  	.max_depth = -1, \
> +	.extended_regexp_option = -1, \

I do not think you meant this.  Uninitialized grep.extendedRegexp
defaults to 0 (BRE), I think.

>  	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
>  	.colors = { \
>  		[GREP_COLOR_CONTEXT] = "", \
> @@ -202,7 +201,6 @@ struct grep_opt {
>  
>  int grep_config(const char *var, const char *value, void *);
>  void grep_init(struct grep_opt *, struct repository *repo);
> -void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
>  
>  void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
>  void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
> diff --git a/revision.c b/revision.c
> index d6e0e2b23b5..dd301e30147 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
>  
>  	diff_setup_done(&revs->diffopt);
>  
> -	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
> -				 &revs->grep_filter);
>  	if (!is_encoding_utf8(get_log_output_encoding()))
>  		revs->grep_filter.ignore_locale = 1;
>  	compile_grep_patterns(&revs->grep_filter);




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux