Re: git on MacOSX and files with decomposed utf-8 file names

Here's a reliable test case for checking filename normalization on Mac OS.

------ cut here -------
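cat > test.pl << 'EOF'
#!/usr/bin/perl -CO
# Precomposed form: U+00E4 (a with diaeresis) as a single code point.
print "M".pack("U",0x00E4)."rchen\n";
# Decomposed form: "a" followed by U+0308 (combining diaeresis).
print "Ma".pack("U",0x0308)."rchen\n";
EOF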
chmod +x test.pl
./test.pl | xargs touch
echo M* | xxd -g1
------ cut here -------

On an NFS-mounted filesystem, you get this:

0000000: 4d 61 cc 88 72 63 68 65 6e 20 4d c3 a4 72 63 68  Ma..rchen M..rch
0000010: 65 6e 0a                                         en.

and on an HFS+ filesystem, you get this:

0000000: 4d 61 cc 88 72 63 68 65 6e 0a                    Ma..rchen.

So this demonstrates that on my MacOS 10.4.11 system, MacOS does no
normalization on NFS: it creates two files, one for each byte sequence.
On HFS+, MacOS maps both filenames to the same decomposed name, so only
one file is created.
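
For anyone who wants to check the byte sequences in the dumps above
without touching a filesystem, here is a minimal sketch in Python (not
part of the original test case; the helper name hexbytes is made up for
illustration).  It uses the standard NFD/NFC forms from the unicodedata
module, which agree with what HFS+ does for this particular name; HFS+
is not guaranteed to match standard NFD for every character.

```python
import unicodedata

# The two spellings printed by test.pl, written as explicit code points.
precomposed = "M\u00e4rchen"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "Ma\u0308rchen"   # "a" followed by U+0308 COMBINING DIAERESIS


def hexbytes(name: str) -> str:
    """Return the UTF-8 bytes of name as space-separated hex, like xxd -g1."""
    return " ".join(f"{b:02x}" for b in name.encode("utf-8"))


print(hexbytes(precomposed))   # 4d c3 a4 72 63 68 65 6e
print(hexbytes(decomposed))    # 4d 61 cc 88 72 63 68 65 6e

# NFD maps both spellings to the decomposed form,
# NFC maps both to the precomposed form.
assert unicodedata.normalize("NFD", precomposed) == decomposed
assert unicodedata.normalize("NFC", decomposed) == precomposed
```

The two printed lines correspond to the two names in the NFS dump, and
the second one is the only name that survives on HFS+.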

More surprising (or perhaps not), given what Kevin Ballard's "reliable source" said:

  "In Mac OS X,  SMB, MSDOS, UDF, ISO 9660 (Joliet), NTFS and ZFS file
  systems all store in one form -- NFC.  We store in NFC since that what is
  expected for these files systems."

Using a Sony Reader (which uses an internal FAT filesystem) hooked up
to a MacOS 10.4.11 system:

% /fs/u1/tmp/test.pl  | xargs touch
% echo M* | xxd -g1
0000000: 4d 61 cc 88 72 63 68 65 6e 0a                    Ma..rchen.

.. which is the decomposed form.  So it looks like on FAT/MSDOS
filesystems MacOS 10.4.11 normalizes filenames to NFD, not to NFC as
the quote above claims.  That will *not* do the right thing as far as
Windows compatibility is concerned on USB sticks, etc.  Mac OS users
would be well advised not to use non-ASCII names in their filesystems
if they care about interoperating with other systems.  :-P
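
The same check can be scripted for an arbitrary mount point.  What
follows is only a rough sketch in Python, not something from the test
above: the function names probe and classify are hypothetical, and it
assumes the directory is writable and that the filesystem hands back
names that decode as UTF-8.  It creates one file in a scratch
subdirectory and compares the name it asked for with the name the
directory listing reports.

```python
import os
import shutil
import tempfile
import unicodedata


def classify(created: str, listed: str) -> str:
    """Compare the requested name with the name the directory reports."""
    if listed == created:
        return "preserved"
    if listed == unicodedata.normalize("NFD", created):
        return "NFD"
    if listed == unicodedata.normalize("NFC", created):
        return "NFC"
    return "other"


def probe(directory: str, name: str = "M\u00e4rchen") -> str:
    """Create name in a scratch subdirectory and report how it was stored."""
    scratch = tempfile.mkdtemp(dir=directory)
    scratch_bytes = os.fsencode(scratch)
    try:
        # Use byte paths so the exact UTF-8 sequence reaches the filesystem
        # and comes back without any locale-dependent decoding.
        with open(os.path.join(scratch_bytes, name.encode("utf-8")), "wb"):
            pass
        (stored,) = os.listdir(scratch_bytes)
        return classify(name, stored.decode("utf-8"))
    finally:
        shutil.rmtree(scratch)


if __name__ == "__main__":
    target = tempfile.gettempdir()  # replace with the mount point to test
    print("precomposed name:", probe(target, "M\u00e4rchen"))
    print("decomposed name: ", probe(target, "Ma\u0308rchen"))
```

Both spellings need to be probed, since a single "preserved" result
cannot tell a filesystem that normalizes to that form from one that
leaves names alone.  A filesystem behaving like the HFS+ case above
should report "NFD" for the precomposed name and "preserved" for the
decomposed one; one behaving like the NFS case should report
"preserved" for both.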

							- Ted