Re: Re-casing directories on case-insensitive systems

On Sat, 12 Jan 2008, Dmitry Potapov wrote:
> 
> After a cursory look at the source code, I wonder if converting name1
> and name2 to upper case before memcmp in cache_name_compare() can
> help case-insensitive systems. This change will change the order of
> file names in the index, but I suppose that it should not be a problem,
> because the index is host specific. Though, this fix is too simple, so
> I guess, I missed something.

No, the index isn't host-specific, and we also have a deep knowledge of 
the fact that the index order is the same as the unpacked tree order.

So no, we absolutely cannot just sort the index differently. We literally 
need to have a separate key for an "upper case lookup".

(That separate key can be just a hash table - it doesn't need to be 
something you can iterate over, so it can be pretty simple).
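
[Editorial sketch, not part of the original message. It illustrates the 
idea of a separate lookup key: the index keeps its byte order, and a small 
side table hashes an ASCII-upper-cased form of each name. Every name here 
(fold_table, fold_entry, fold_hash, fold_lookup, fold_insert, 
FOLD_TABLE_SIZE) is hypothetical, as is the choice of FNV-1a and a fixed 
bucket count; none of it is taken from git's source.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FOLD_TABLE_SIZE 64	/* hypothetical fixed size, must be a power of two */

/* Hypothetical entry: the name keeps its original bytes. */
struct fold_entry {
	const char *name;
	struct fold_entry *next;	/* next entry in the same bucket */
};

struct fold_table {
	struct fold_entry *bucket[FOLD_TABLE_SIZE];
};

/* Fold only ASCII a-z. Every other byte is left alone, so no locale is involved. */
static unsigned char fold_byte(unsigned char c)
{
	return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

/* FNV-1a hash computed over the folded bytes of the name. */
static unsigned int fold_hash(const char *name)
{
	unsigned int h = 2166136261u;

	for (; *name; name++) {
		h ^= fold_byte((unsigned char)*name);
		h *= 16777619u;
	}
	return h;
}

/* Returns 1 when both names are identical after ASCII folding. */
static int fold_equal(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (fold_byte((unsigned char)*a) != fold_byte((unsigned char)*b))
			return 0;
	return *a == *b;	/* both must end at the same position */
}

/* Add an entry to the side table. The sorted index itself is not touched. */
static void fold_insert(struct fold_table *t, struct fold_entry *e)
{
	unsigned int i;

	assert(t && e && e->name);
	i = fold_hash(e->name) & (FOLD_TABLE_SIZE - 1);
	e->next = t->bucket[i];
	t->bucket[i] = e;
}

/* Returns the stored name that matches case-insensitively, or NULL. */
static const char *fold_lookup(const struct fold_table *t, const char *name)
{
	const struct fold_entry *e;

	e = t->bucket[fold_hash(name) & (FOLD_TABLE_SIZE - 1)];
	for (; e; e = e->next)
		if (fold_equal(e->name, name))
			return e->name;
	return NULL;
}
```

[With "Makefile" inserted, fold_lookup(&t, "makefile") returns the stored 
"Makefile". The table is only ever probed, never walked in order, which is 
why a plain hash is enough.]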

> > (And that's totally ignoring the fact that case-insensitivity then also 
> > has tons of i18n issues and can get *really* messy 
> 
> The proper support of i18n is not simple even without case-insensitivity.
> For instance, there are four different encodings widely used for Russian
> letters.

.. and git is very clear about this: filenames are *not* "characters" in 
the i18n sense, they are series of bytes. There is absolutely no room for 
ambiguity, and there is no locale for those things.

And that isn't going to change. It's the only sane way to do 
locale-independent names: people can *choose* to see the filenames as some 
UTF-8 sequence, or a series of Latin1, or anything, but that's not 
something git itself will care about.

Trying to involve locale in name comparison simply isn't possible. Two 
different repositories on two different filesystems would get two 
different answers. And that is simply unacceptable in a distributed 
system.
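
[Editorial sketch, not part of the original message. It shows what a 
locale-free, byte-wise name comparison can look like: memcmp over the 
common prefix, then the shorter name first. The function name 
name_compare_bytes and its signature are hypothetical and are not claimed 
to match git's cache_name_compare().]

```c
#include <assert.h>
#include <string.h>

/*
 * Compare two names strictly as byte sequences.
 * Returns <0, 0 or >0. The result depends only on the bytes and lengths,
 * so every host gets the same answer regardless of locale settings.
 */
static int name_compare_bytes(const char *a, size_t alen,
			      const char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int cmp;

	assert(a && b);
	cmp = memcmp(a, b, n);	/* memcmp compares as unsigned char */
	if (cmp)
		return cmp;
	if (alen < blen)
		return -1;	/* a is a proper prefix of b */
	if (alen > blen)
		return 1;
	return 0;
}
```

[Under this comparison "B" sorts before "a" (0x42 < 0x61), and the UTF-8 
bytes C3 A9 and the Latin1 byte E9 are simply two different names, even 
though a user might read both as the same letter.]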

What we can do is make the simple cases (i.e. the locale-*independent* 
ones) warn about problems with case insensitivity.
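
[Editorial sketch, not part of the original message. It assumes that the 
"simple cases" means names that differ only in ASCII letter case, and 
reports such pairs with a warning. The function names, the warning text 
and the quadratic scan are all hypothetical choices made to keep the 
example short.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ASCII-only upper-casing: bytes outside a-z are returned unchanged. */
static int ascii_upper(int c)
{
	return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/*
 * Returns 1 when the two names are different byte sequences that become
 * equal after ASCII case folding. Byte-identical names return 0, and so
 * do names that differ in any non-ASCII byte.
 */
static int differs_only_in_ascii_case(const char *a, const char *b)
{
	int differs = 0;

	assert(a && b);
	for (; *a && *b; a++, b++) {
		unsigned char ca = (unsigned char)*a;
		unsigned char cb = (unsigned char)*b;

		if (ca == cb)
			continue;
		if (ascii_upper(ca) != ascii_upper(cb))
			return 0;
		differs = 1;
	}
	return differs && *a == *b;
}

/* Warn about every colliding pair and return how many were found. */
static size_t warn_case_collisions(const char *const *names, size_t n,
				   FILE *out)
{
	size_t i, j, count = 0;

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (differs_only_in_ascii_case(names[i], names[j])) {
				fprintf(out,
					"warning: '%s' and '%s' differ only in case\n",
					names[i], names[j]);
				count++;
			}
	return count;
}
```

[Names whose differences involve bytes outside ASCII are deliberately left 
alone here: deciding whether those "match" would need an encoding and a 
locale, which is exactly what the byte-sequence rule avoids.]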

			Linus