In the previous change to this function, we added locale support for
single-byte encodings only. It looks like pcre only supports utf-* as
multibyte encodings; the others are left out in the cold (which is
fine).

We need to enable PCRE_UTF8 so pcre can parse the string correctly
before folding case.

Noticed-by: Plamen Totev <plamen.totev@xxxxxx>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
---
 grep.c                          | 2 ++
 t/t7812-grep-icase-non-ascii.sh | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/grep.c b/grep.c
index c79aa70..7c9e437 100644
--- a/grep.c
+++ b/grep.c
@@ -326,6 +326,8 @@ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 
 	if (opt->ignore_case) {
 		p->pcre_tables = pcre_maketables();
+		if (is_utf8_locale())
+			options |= PCRE_UTF8;
 		options |= PCRE_CASELESS;
 	}
 
diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
index c945589..1306cc0 100755
--- a/t/t7812-grep-icase-non-ascii.sh
+++ b/t/t7812-grep-icase-non-ascii.sh
@@ -16,6 +16,12 @@ test_expect_success GETTEXT_LOCALE 'grep literal string, no -F' '
 	git grep -i "TILRAUN: HALLÓ HEIMUR!"
 '
 
+test_expect_success GETTEXT_LOCALE,LIBPCRE 'grep pcre string' '
+	git grep --perl-regexp    "TILRAUN: H.lló Heimur!" &&
+	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
+	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
+'
+
 test_expect_success GETTEXT_LOCALE 'grep literal string, with -F' '
 	git grep --debug -i -F "TILRAUN: Halló Heimur!"  2>&1 >/dev/null |
 	grep fixed >debug1 &&
-- 
2.3.0.rc1.137.g477eb31
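
For illustration only (not part of the patch): a minimal standalone C sketch of why a byte-oriented case fold cannot match "ó" against "Ó" in UTF-8, which is the reason the string has to be parsed as UTF-8 before folding case. It does not use pcre at all. All helper names (fold_byte, decode_utf8, fold_cp, bytes_equal_icase, codepoints_equal_icase) are hypothetical, the decoder only handles 1- and 2-byte sequences, and the fold table only covers ASCII and the Latin-1 supplement. Real pcre uses its own tables and Unicode data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold one ASCII byte. Bytes >= 0x80 pass through unchanged, which is
 * roughly what a byte-oriented fold with no locale tables would do. */
static unsigned char fold_byte(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 0x20) : c;
}

/* Caseless comparison that looks at one byte at a time. */
static int bytes_equal_icase(const char *a, const char *b)
{
	assert(a && b);
	while (*a && *b) {
		if (fold_byte((unsigned char)*a) != fold_byte((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;
}

/* Hypothetical decoder: handles only 1- and 2-byte UTF-8 sequences
 * (up to U+07FF). Returns the number of bytes consumed, or 0 if the
 * input is malformed or outside that range. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp)
{
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	}
	if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
		*cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
		return *cp >= 0x80 ? 2 : 0; /* reject overlong forms */
	}
	return 0;
}

/* Fold a code point: ASCII plus U+00C0..U+00DE (except U+00D7, the
 * multiplication sign), whose lowercase forms sit 0x20 higher. */
static uint32_t fold_cp(uint32_t cp)
{
	if (cp >= 'A' && cp <= 'Z')
		return cp + 0x20;
	if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7)
		return cp + 0x20;
	return cp;
}

/* Caseless comparison that decodes UTF-8 first, then folds. */
static int codepoints_equal_icase(const char *a, const char *b)
{
	const unsigned char *pa = (const unsigned char *)a;
	const unsigned char *pb = (const unsigned char *)b;

	assert(a && b);
	while (*pa && *pb) {
		uint32_t ca, cb;
		size_t na = decode_utf8(pa, &ca);
		size_t nb = decode_utf8(pb, &cb);

		if (!na || !nb || fold_cp(ca) != fold_cp(cb))
			return 0;
		pa += na;
		pb += nb;
	}
	return *pa == *pb;
}
```

In UTF-8, "ó" is the bytes 0xC3 0xB3 and "Ó" is 0xC3 0x93. The byte-wise comparison sees 0xB3 and 0x93 as unrelated bytes, while the decoding comparison sees U+00F3 and U+00D3 and folds them to the same code point.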