Re: Re-casing directories on case-insensitive systems

On Jan 11, 2008, at 7:15 PM, Robin Rosenberg wrote:

On Saturday, 12 January 2008, Kevin Ballard wrote:
Speaking of normalizing composed sequences, could that be the cause
for the following?
[...]
kevin@KBALLARD:~/Dev/git/gitweb/test> ls Märchen | xxd
0000000: 4d61 cc88 7263 6865 6e0a                 Ma..rchen.

As you can see, git has the file tracked using M\303\244rchen, where
\303\244 (or 0xC3A4, or U+00E4) is Latin Small Letter A With
Diaeresis, but the filesystem reports it as "Ma\xCC\x88rchen" where
0xCC88 (or U+0308) is Combining Diaeresis.

Yes, that is due to normalization. When adding a file by name, git uses the user-supplied name, but when adding files indirectly it gets the names from the file system without denormalizing them. Likewise, status gets the names from the file system without denormalizing them, and thus you get a mismatch.
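
To illustrate the mismatch described above, here is a minimal Python sketch. It is not git's code; the variable names and the same_name helper are made up for the example. It only shows that the two spellings of "Märchen" from the xxd output are different byte strings that compare equal once both are brought to the same Unicode normalization form:

```python
import unicodedata

# Name as git tracks it: precomposed U+00E4 (UTF-8 bytes C3 A4).
tracked = "M\u00e4rchen"
# Name as the filesystem reports it: "a" followed by combining
# diaeresis U+0308 (UTF-8 bytes CC 88).
on_disk = "Ma\u0308rchen"

print(tracked.encode("utf-8"))
print(on_disk.encode("utf-8"))
print(tracked == on_disk)  # a plain comparison sees two different names


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after composing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(tracked, on_disk))
```

A byte-for-byte comparison of the tracked name against the name read back from the directory fails, which is consistent with the file showing up as untracked.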

Is there a reason for this? Given this behavior, it seems like it would be trivial to end up with misdiagnosed "untracked" files when using any language other than English.

-Kevin Ballard

--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com

