On Tue, Feb 28, 2012 at 04:20:30PM -0800, Junio C Hamano wrote:

> In order to prepare the kwset machinery for a case-insensitive search, we
> used to use a static table of 256 elements and filled it every time before
> calling kwsalloc(). Because the kwset machinery will never modify this
> table, just allocate a single instance globally and fill it at the compile
> time.

Hmm. I was going to complain that the original code used tolower() to
generate the table at run-time, and therefore respected the current
locale. But of course we have replaced tolower() with a
locale-independent version, so it should behave identically.

But that does make me wonder. Do people expect their case-insensitive
searches to work on non-ASCII characters? I would think yes, but I do
not use non-ASCII characters in the first place, so my opinion may not
mean much.

For that matter, does REG_ICASE respect locales? The glibc code appears
to consider it, but I couldn't make it work in some simple tests. But if
it does, that raises another weirdness: we fall back to kwset
transparently when a grep pattern contains no metacharacters. So you
would get different results for "-i --grep=é" versus "-i --grep=é.*".

Of course, even if we used a locale-respecting version of tolower in the
original code, I suspect that a byte table would be fundamentally
insufficient, anyway, in the face of multi-byte encodings like UTF-8. So
I don't think your patch is making the problem any worse. And even if
somebody wants to tackle the problem later, the solution would look so
unlike the original code that your change is not hurting their effort.

-Peff
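
To illustrate the multi-byte point, here is a minimal sketch of a
256-entry fold table applied byte by byte. It assumes an ASCII-only
table similar in spirit to the one the patch fills at compile time; the
names fold_table, fold_table_init() and fold_equal() are hypothetical
and are not taken from git's source.

```c
#include <string.h>

/* Hypothetical ASCII-only fold table; not git's actual table. */
static unsigned char fold_table[256];

/* Map 'A'..'Z' to 'a'..'z' and leave every other byte value unchanged. */
static void fold_table_init(void)
{
	int i;

	for (i = 0; i < 256; i++)
		fold_table[i] = (i >= 'A' && i <= 'Z') ? i + ('a' - 'A') : i;
}

/* Return 1 if the two strings are equal after folding each byte. */
static int fold_equal(const char *a, const char *b)
{
	size_t n = strlen(a);
	size_t i;

	if (n != strlen(b))
		return 0;
	for (i = 0; i < n; i++)
		if (fold_table[(unsigned char)a[i]] !=
		    fold_table[(unsigned char)b[i]])
			return 0;
	return 1;
}
```

After calling fold_table_init(), fold_equal("GREP", "grep") returns 1,
but fold_equal("\xc3\x89", "\xc3\xa9") returns 0. Those are the UTF-8
encodings of "É" and "é"; they differ only in the second byte (0x89
versus 0xa9), and a per-byte table sees each byte on its own, without
the lead byte that says which character it belongs to.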