Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes:

> In the previous change in this function, we add locale support for
> single-byte encodings only. It looks like pcre only supports utf-* as
> multibyte encodings, the others are left in the cold (which is
> fine). We need to enable PCRE_UTF8 so pcre can parse the string
> correctly before folding case.

>  	if (opt->ignore_case) {
>  		p->pcre_tables = pcre_maketables();
> +		if (is_utf8_locale())
> +			options |= PCRE_UTF8;
>  		options |= PCRE_CASELESS;
>  	}

We need to set the PCRE_UTF8 flag whenever the locale is UTF-8, not
only when the search is case-insensitive. Otherwise pcre treats the
encoding as single-byte, and a regex that contains quantifiers will
not work as expected: a quantifier that follows a multi-byte character
applies to the last byte of that character instead of to the whole
character.

For example, let's take a file that contains the string

    TILRAUN: HALLÓÓÓ HEIMUR!

The command

    git grep -P "HALLÓ{3}"

will not match the file, while

    git grep -P "HAL{2}ÓÓÓ"

will. That's because L is a single byte, whereas Ó is encoded as two
bytes (0xC3 0x93) in UTF-8.

Regards,
Plamen Totev
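
The effect can be reproduced without pcre. What follows is only a
minimal sketch in C, not code from git or pcre: the names O_ACUTE,
HAYSTACK, expand_quantifier() and quantified_match() are made up for
this illustration. It expands "<head>{count}<tail>" into the literal
string such a pattern would match, taking the quantified unit to be
either the last byte of head (byte mode) or the last UTF-8 character
of head (UTF-8 mode), and then searches for it with strstr().

```c
#include <stddef.h>
#include <string.h>

/* U+00D3 (O with acute) as its two UTF-8 bytes. */
#define O_ACUTE "\xC3\x93"
/* The sample line from the message above. */
#define HAYSTACK "TILRAUN: HALL" O_ACUTE O_ACUTE O_ACUTE " HEIMUR!"

/*
 * Hypothetical helper: write the literal matched by "<head>{count}<tail>"
 * into out. The quantifier applies to the last unit of head: one byte when
 * utf8 == 0, one whole UTF-8 character when utf8 != 0.
 * Returns 0 on success, -1 on bad arguments or if out is too small.
 */
int expand_quantifier(const char *head, int count, const char *tail,
		      int utf8, char *out, size_t outsz)
{
	size_t hlen = strlen(head);
	size_t unit = 1;
	size_t pos, need;
	int i;

	if (hlen == 0 || count < 0)
		return -1;
	if (utf8) {
		/* Step back over continuation bytes (10xxxxxx) to the lead byte. */
		pos = hlen - 1;
		while (pos > 0 && ((unsigned char)head[pos] & 0xC0) == 0x80)
			pos--;
		unit = hlen - pos;
	}
	need = (hlen - unit) + unit * (size_t)count + strlen(tail) + 1;
	if (need > outsz)
		return -1;

	/* Everything before the quantified unit is copied once. */
	memcpy(out, head, hlen - unit);
	out += hlen - unit;
	/* The quantified unit is repeated count times. */
	for (i = 0; i < count; i++) {
		memcpy(out, head + hlen - unit, unit);
		out += unit;
	}
	strcpy(out, tail);
	return 0;
}

/* Returns 1 if the expanded literal occurs in haystack, 0 otherwise. */
int quantified_match(const char *haystack, const char *head, int count,
		     const char *tail, int utf8)
{
	char buf[256];

	if (expand_quantifier(head, count, tail, utf8, buf, sizeof(buf)) < 0)
		return 0;
	return strstr(haystack, buf) != NULL;
}
```

With these helpers, quantified_match(HAYSTACK, "HALL" O_ACUTE, 3, "", 0)
searches for the bytes "HALL" C3 93 93 93 and finds nothing, while the
same call with utf8 set to 1 searches for "HALL" followed by three
whole Ó characters and succeeds. The "HAL{2}ÓÓÓ" case succeeds in both
modes because the quantified unit, L, is one byte either way.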