On Wed, Jan 23, 2008 at 08:16:33AM -0800, Linus Torvalds wrote:
>
> On Wed, 23 Jan 2008, Theodore Tso wrote:
> >
> > So this demonstrates that on my MacOS 10.4.11 system, on NFS, MacOS is
> > doing no normalization, as it is creating two files. On HFS+, MacOS
> > is mapping both filenames to the same decomposed name.
>
> Well, it demonstrates that (a) the OS and (b) _perl_ don't mangle
> filenames on non-HFS+ filesystems.

Well, "touch" actually, since that was what was actually creating the files; I only used perl because it was the easiest way to guarantee exactly how the filenames would be generated.

> The problem is that since most native applications *expect* that name
> mangling, they'll probably do name mangling of their own (internally) just
> to compare the names!
>
> So I would not be surprised if the globbing libraries, for example, will
> do NFD-mangling in order to glob "correctly", so even programs ported from
> real Unix might end up getting pathnames subtly changed into NFD as part
> of some hot library-on-library action with UTF hackery inside.

It's worse than that. You can specify at format time whether or not HFS+ is case-sensitive, and of course, there is UFS, which I expect does no Unicode normalization at all, much like NFS.

I suspect what you've pointed out is why certain MacOS programs break horribly when run on non-HFS+ filesystems, though. And if that is the case, then those same programs might not be reliable if the user's home directory is stored on NFS, as it would be in an enterprise/corporate environment, if Apple ever wants to have any hope of penetrating that market.

Because of this, git code won't be able to just check for HFS+; it will probably have to do a run-time test to see whether or not the filesystem is doing case-folding, since that can be turned on or off on a per-filesystem basis. Also unknown, and which should be tested, is whether turning off case-folding also turns off Unicode normalization.
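A run-time test of the kind described above might look roughly like the following Python sketch. It is only an illustration, not anything git does: the names probe_filesystem and FsTraits are made up for this example, and it assumes the directory is writable and that the filesystem accepts non-ASCII names. It creates a file under one spelling and checks whether a different spelling (different case, or NFD instead of NFC) finds the same file.

```python
import os
import tempfile
import unicodedata
from typing import NamedTuple


class FsTraits(NamedTuple):
    """Hypothetical result type for this sketch."""
    folds_case: bool          # "casetest" is reachable as "CASETEST"
    normalizes_unicode: bool  # NFC name is reachable via its NFD spelling
    stored_name: str          # spelling that readdir actually reports


def probe_filesystem(directory: str) -> FsTraits:
    """Probe the filesystem holding `directory` by creating scratch files."""
    nfc = unicodedata.normalize("NFC", "caf\u00e9")  # precomposed e-acute
    nfd = unicodedata.normalize("NFD", nfc)          # "e" + combining acute

    # The scratch directory and everything in it is removed on exit.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        # Case-folding test: create a lower-case name, look it up in upper case.
        open(os.path.join(scratch, "casetest"), "w").close()
        folds_case = os.path.exists(os.path.join(scratch, "CASETEST"))

        # Normalization test: create the NFC spelling, look up the NFD one.
        open(os.path.join(scratch, nfc), "w").close()
        normalizes = os.path.exists(os.path.join(scratch, nfd))

        # The lookup alone cannot tell "equivalent at lookup time" apart from
        # "name rewritten on disk", so also record what the directory listing
        # returns for the file we just created.
        stored = next(
            name for name in os.listdir(scratch)
            if unicodedata.normalize("NFC", name) == nfc
        )

    return FsTraits(folds_case, normalizes, stored)
```

Calling probe_filesystem() on a directory of each volume in question and comparing folds_case with normalizes_unicode would answer the open question of whether the two behaviours are switched together. Comparing stored_name against the NFC and NFD spellings shows whether the name was rewritten on disk.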
It may be that they made case-folding optional so that HFS+ could be UFS compatible, since Darwin *must* be built on a UFS filesystem, reflecting its Mach/BSD heritage. (I ran across this while doing my web research; apparently HFS+ has been causing Apple headaches internally. Heh. :-)

>Things like the finder etc, which must be very aware of the fact that
>filenames get corrupted, would presumably internally always convert
>everything they get into NFD in order to compare names from different
>sources. And as part of that, programs may well corrupt the name before
>they then use it to create a pathname.

Well, hopefully not everyone inside Apple's OS groups is a total moron, and they actually use a utf8_str_equiv() routine instead of strcmp() to do their Unicode comparisons. But then again, maybe not...

> The fact that your perl program works under NFS, but creates NFD on a VFAT
> volume, does imply that they probably used at least some of the same
> routines they use in HFS+ for VFAT. Not entirely surprising: doing case
> insensitive stuff with Unicode is nasty code, so why not share it (even if
> it's then incorrect for FAT)..
>
> Piece of crap it is, though. Apple has painted themselves into a nasty
> corner there.

No kidding!!

						- Ted