Re: Re-casing directories on case-insensitive systems

Dmitry Potapov <dpotapov@xxxxxxxxx> · Sat, 12 Jan 2008 17:46:29 +0300

On Fri, Jan 11, 2008 at 02:08:35PM -0800, Linus Torvalds wrote:
> 
> However, it's not like there is even a simple solution. The right place to 
> do that check would probably be in "add_index_entry()", but doing a check 
> whether the same file already exists (in a different case) is simply 
> *extremely* expensive for a very critical piece of code, unless we were to 
> change that index data structure a lot (ie add a separate hash for the 
> filenames).

After a cursory look at the source code, I wonder whether converting
name1 and name2 to upper case before the memcmp in cache_name_compare()
would help case-insensitive systems. This would change the order of
file names in the index, but I suppose that should not be a problem,
because the index is host specific. This fix seems too simple, though,
so I guess I have missed something.
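
To make the idea concrete, here is a minimal sketch of such a comparison.
The function name name_compare_icase() and its signature are hypothetical
and are not Git's actual cache_name_compare(); it folds ASCII letters only,
so it does nothing for non-ASCII names.

```c
#include <stddef.h>

/* Fold one byte to upper case, ASCII letters only (locale independent). */
static int fold_ascii(unsigned char c)
{
	return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/*
 * Hypothetical helper, not Git's code: compare two names byte by byte
 * after ASCII case folding. A shorter name that is a prefix of a longer
 * one sorts first, as it would with memcmp() plus a length check.
 */
static int name_compare_icase(const char *name1, size_t len1,
			      const char *name2, size_t len2)
{
	size_t len = len1 < len2 ? len1 : len2;
	size_t i;

	for (i = 0; i < len; i++) {
		int c1 = fold_ascii((unsigned char)name1[i]);
		int c2 = fold_ascii((unsigned char)name2[i]);
		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
	}
	if (len1 < len2)
		return -1;
	if (len1 > len2)
		return 1;
	return 0;
}
```

With this ordering "Makefile" and "makefile" compare equal, and "a" sorts
before "B", whereas a plain memcmp() puts "B" first. That is the change in
index order mentioned above.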

> (And that's totally ignoring the fact that case-insensitivity then also 
> has tons of i18n issues and can get *really* messy 

The proper support of i18n is not simple even without case-insensitivity.
For instance, there are four different encodings widely used for Russian
letters. On Windows alone, you have two simultaneously in the default
settings -- Windows-1251 for Windows applications and CP866 for console
applications... Actually, some console applications can change their
default encoding, and it seems Cygwin programs do that. So, based on whether you
use gcc from Cygwin or Visual C to compile your console program, you can
get a different encoding. On *nix in Russia, KOI8-R and UTF-8 are the most
popular... So, if you have a repository shared between different systems,
you cannot think about a file name just as a sequence of bytes anymore.
OTOH, I doubt that many people are really interested in using non-ASCII
file names with Git right now.
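
As a small illustration of why bytes alone are not enough, the sketch below
lists the byte sequence for one letter, CYRILLIC SMALL LETTER A (U+0430), in
the four encodings mentioned above. The struct and helper names are made up
for this example.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative type: one name as stored on disk in a given encoding. */
struct encoded_name {
	const char *encoding;
	const char *bytes;
	size_t len;
};

/* The same letter, U+0430, in four encodings. */
static const struct encoded_name cyrillic_a[] = {
	{ "Windows-1251", "\xE0",     1 },
	{ "CP866",        "\xA0",     1 },
	{ "KOI8-R",       "\xC1",     1 },
	{ "UTF-8",        "\xD0\xB0", 2 },
};

/* Byte-wise equality, which is all a "sequence of bytes" view can check. */
static int same_bytes(const struct encoded_name *a,
		      const struct encoded_name *b)
{
	return a->len == b->len && !memcmp(a->bytes, b->bytes, a->len);
}
```

No two entries in the table compare equal with same_bytes(), although all
four spell the same one-letter file name.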

Dmitry