On Jan 11, 2008, at 7:15 PM, Robin Rosenberg wrote:
> On Saturday, 12 January 2008, Kevin Ballard wrote:
>> Speaking of normalizing composed sequences, could that be the cause for the following?
>> [...]
>> kevin@KBALLARD:~/Dev/git/gitweb/test> ls Märchen | xxd
>> 0000000: 4d61 cc88 7263 6865 6e0a  Ma..rchen.
>>
>> As you can see, git has the file tracked using M\303\244rchen, where \303\244 (or 0xC3A4, or U+00E4) is Latin Small Letter A With Diaeresis, but the filesystem reports it as "Ma\xCC\x88rchen" where 0xCC88 (or U+0308) is Combining Diaeresis.
>
> Yes, that is due to normalization. When adding a file by name, git uses the user-supplied name, but when adding files indirectly it gets the names from the file system without denormalizing them. Likewise, status gets the names from the file system without denormalizing, and thus you get a mismatch.
Is there a reason for this? Given this behavior, it seems like it would be trivial to end up with misdiagnosed "untracked" files when using any language other than English.
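To make the mismatch concrete, here is a minimal sketch in Python that reproduces the two byte sequences from the xxd output above and shows that they only compare equal after normalization. This is not how git handles names; the variable names and the `same_name` helper are hypothetical and exist only for illustration.

```python
import unicodedata

# Precomposed form: U+00E4, the name git has recorded (UTF-8 bytes C3 A4).
tracked = "M\u00e4rchen"
# Decomposed form: "a" + U+0308, the name the filesystem reports (UTF-8 bytes CC 88).
on_disk = "Ma\u0308rchen"


def same_name(a, b):
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(tracked.encode("utf-8"))       # b'M\xc3\xa4rchen'
print(on_disk.encode("utf-8"))       # b'Ma\xcc\x88rchen'
print(tracked == on_disk)            # False: a plain comparison sees two different names
print(same_name(tracked, on_disk))   # True: equal once both are normalized
```

A plain byte-for-byte comparison is what makes the file show up as untracked here; whether normalizing on comparison is the right fix for git is a separate question.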
-Kevin Ballard

-- 
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com