On Sun, Jan 24 2021, Ramsay Jones wrote: > On 24/01/2021 13:53, Ramsay Jones wrote: > [snip] > >>> diff --git a/grep.c b/grep.c >>> index efeb6dc58d..e329f19877 100644 >>> --- a/grep.c >>> +++ b/grep.c >>> @@ -492,7 +492,13 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt >>> } >>> if (!opt->ignore_locale && is_utf8_locale() && has_non_ascii(p->pattern) && >>> !(!opt->ignore_case && (p->fixed || p->is_fixed))) >>> - options |= PCRE2_UTF; >>> + options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF); >>> + >>> + if (PCRE2_MATCH_INVALID_UTF && >>> + options & (PCRE2_UTF | PCRE2_CASELESS) && >>> + !(PCRE2_MAJOR >= 10 && PCRE2_MAJOR >= 36)) >> ^^^^^^^^^^^^^^^^^^ >> I assume that this should be s/_MAJOR/_MINOR/. ;-) >> Oops on the s/MAJOR/MINOR/g. Well spotted, I think I'll wait a bit more for other comments for a re-roll. Perhaps Junio can be kind and do the s/_MAJOR/_MINOR/ fixup in the meantime to save be from spamming the list too much... FWIW I have tested this on a verion without PCRE2_MATCH_INVALID_UTF, but I think I did that by manually editing the "PCRE2_UTF" line above, and then wrote this bug. > Although, perhaps you want: > > !(((PCRE2_MAJOR * 100) + PCRE2_MINOR) >= 1036) > > ... or something similar. Probably better to use pcre2_config(PCRE2_CONFIG_VERSION) at that point and versioncmp() the string. FWIW we used this pattern (I added) before with PCRE, see e87de7cab4 (grep: un-break building with PCRE < 8.32, 2017-05-25). And if you want to be clever ((PCRE2_MAJOR) | ((PCRE2_MINOR) << 16)) is probably better, then you can get it out with bit operations :)