On Jan 16, 2008, at 11:51 PM, Martin Langhoff wrote:
On Jan 17, 2008 5:30 PM, Kevin Ballard <kevin@xxxxxx> wrote:
Those of us who grew up on a case-insensitive filesystem don't find there to be any problem with it. I can count on one hand the number of
I guess you haven't used unix tools much. The ever-popular HEAD perl utility (which does an HTTP HEAD against a URL), when installed, silently overwrites the head shell utility, which is used for all sorts of things, some even in startup scripts. Ooops! I've been hit by this more than once - and if you google for it, it hurt a lot of people.
I can imagine. However, I've never been hit by such a situation. This doesn't mean a case-insensitive filesystem is a problem per se; it means interactions between a case-insensitive and a case-sensitive filesystem can be a problem. That doesn't mean either way is "correct"; it just means the two don't work well together.
I like ice cream, and I like steak, but I sure don't think a mixture of steak and ice cream would go well together. Do you?
That's only true if you don't know what type of filesystem you're on. And, in the vast majority of cases (in fact, a content tracker is the only exception I can think of), it doesn't matter. If the user said
Hmmm. Many important tools - that I wouldn't want to ever fail! - have similar needs to git. Backup/restore and file replication tools for example.
Both of which would be replicating the directory contents, not a listing of files specified by the user. If, as a user, I were to say "please replicate file FOO" and the file was really called "foo", I wouldn't be in the least surprised to see the tool take me at my word and produce a file called "FOO" with the contents of "foo". But in general, things like this operate on the filesystem, not on the user args.
This is why case-insensitivity is so hard: you have a very real "aliasing" on the filesystem level, where all those really *different* pathnames end up being the same thing.
I don't see that as being a problem. Think of it, if you will, as if every single file simply had an implicit hardlink for every possible case or normalization variant. The whole point of the filename is that
Ok - but how do you track the directory then (in git's terms, the tree). There's no way to tell what the user wants. Does the user want a copy of the file with different capitalization, or is the OS playing games?
If I say "track FOO", I probably mean it. So go ahead and track "FOO", even if you end up tracking the contents of file "foo". I certainly won't blame the tool for doing what I told it.
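The "aliasing" under discussion can be made concrete with a minimal Python sketch. It assumes plain case folding as a stand-in for whatever comparison a case-insensitive filesystem really performs (HFS+ uses its own rules), and the function name `colliding_names` is hypothetical. It groups a list of tracked names by their folded form and reports the groups that would land on the same file:

```python
from collections import defaultdict

def colliding_names(names):
    """Group names that are equal under case folding (an assumed stand-in
    for a case-insensitive filesystem's comparison) and return only the
    groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]

if __name__ == "__main__":
    # "HEAD" and "head" are distinct on a case-sensitive filesystem,
    # but alias each other on a case-insensitive one.
    print(colliding_names(["HEAD", "head", "Makefile"]))
```

A tool that tracks names rather than directory contents could run a check like this before writing files, but whether a collision is an error or the user's intent is exactly the question being argued here.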
it is meta-information, used as an identifier and not as actual content, and thus it is perfectly fine for it to be a real string, subject to interpretation,
I don't think you *actually* want it subject to interpretation.
Sure I do. I find it very convenient, for example, to say "cd documents/school" when I really want to go to "Documents/School". Similarly, if I'm trying to reference gitweb/tests/Märchen, I'm quite happy to not have to figure out what normalization the filename is using and attempt to replicate that (especially as I have no idea which normalization my input mechanism uses - unlike Linus, I don't have a key dedicated to ä, and even if I did I wouldn't necessarily expect it to use precomposed vs decomposed). I can't think of a single reason why I'd want to be able to have 2 different files named "Märchen" on my disk.
On the other hand, treating Unicode normalization as significant can pose security risks - how am I to know that the file that is named "foo.txt" is really the same file "foo.txt" that I last saw? Someone I know on IRC sent me this image[1], which shows 6 files all apparently named "foo.txt" on a disk image. This is possible because on a case-sensitive HFS+ volume, the file system doesn't ignore ignorables when comparing filenames (it does on a case-insensitive HFS+ system), and so all of those filenames look identical up until you actually pipe their names through xxd and look at the byte sequence. When this sort of tomfoolery is possible, I simply cannot trust the names of any of my files anymore.
[1]: http://sailor月.com/imgs/ignorable.png
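To illustrate the normalization point, here is a minimal Python sketch. It shows the precomposed and decomposed spellings of "Märchen" as distinct code point sequences that compare equal once both are normalized, and a "foo.txt" lookalike that differs only in its bytes. The helper names `utf8_hex` and `equivalent` are hypothetical. U+200C (zero width non-joiner) is used as an assumed example of an invisible character; the sketch does not check which characters any particular filesystem actually ignores.

```python
import unicodedata

# "Märchen" written with escapes so the code points are unambiguous.
precomposed = "M\u00e4rchen"   # ä as the single code point U+00E4
decomposed = "Ma\u0308rchen"   # a followed by combining diaeresis U+0308

def utf8_hex(name: str) -> str:
    """Hex dump of the UTF-8 bytes, roughly what xxd would reveal."""
    return name.encode("utf-8").hex()

def equivalent(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# An invisible character hidden inside an otherwise ordinary name.
plain = "foo.txt"
sneaky = "foo\u200c.txt"

if __name__ == "__main__":
    for name in (precomposed, decomposed, plain, sneaky):
        print(repr(name), utf8_hex(name))
    print(equivalent(precomposed, decomposed))
    print(equivalent(plain, sneaky))
```

Normalization alone makes the two "Märchen" spellings equal but leaves the lookalike "foo.txt" distinct; telling those apart or treating them as the same requires a separate decision about ignorable characters.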
Again, as someone who grew up in a case-insensitive world, there's no problems here. I wish I could tell you that it causes problems, I wish I could agree with you, but I can't.
Probably because you have been surrounded by tools that have a lot of extra code to cope with the case insensitive way of life, and learned to not do things that are completely valid, just to avoid trouble. Which is ok, but I don't think it makes the OS design decision
Extra code? I don't think so. The only reason I'd need extra code is if I were attempting to explicitly detect the "real" filename for a user-supplied argument, by scanning the directory contents until I found a file that was equivalent to the given argument. But there's no reason to do that. None of the code I've ever written, or any of the code I've ever seen, has had to do any extra work because it was on a case-insensitive filesystem. I contribute to a packaging system for the Mac called MacPorts, and I've never seen any patches on any of the 4000+ ports to handle case insensitivity (granted, I haven't looked at every port, but I've looked at a significant fraction). It's a complete non-issue.
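For reference, the one kind of "extra code" described above, scanning a directory for the entry equivalent to a user-supplied name, might look like the following Python sketch. The names `fold` and `real_name` are hypothetical, and the equivalence used (NFD normalization plus case folding) is an assumption for illustration, not the comparison HFS+ itself implements.

```python
import os
import unicodedata

def fold(name: str) -> str:
    """Key under which two names count as equivalent in this sketch:
    NFD normalization followed by case folding (an assumption)."""
    return unicodedata.normalize("NFD", name).casefold()

def real_name(directory: str, wanted: str):
    """Return the on-disk entry equivalent to `wanted`, or None if the
    directory holds no such entry."""
    key = fold(wanted)
    for entry in os.listdir(directory):
        if fold(entry) == key:
            return entry
    return None
```

Given a directory containing a file "foo", `real_name(directory, "FOO")` returns "foo". Most programs never need this, since they simply open the path they were given and let the filesystem resolve it.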
The content of files is sacred. The filename is only there to provide a handle to locate the contents. I don't see any problem with expanding the equivalency scope of the filename to accept multiple encodings and cases. The only arguments I can see that have any validity at all are the ones that sound like "we use case-sensitive filesystems, and your case-insensitivity and normalization are causing problems with our tools! Conform to our world!". As I said above, this isn't a problem of case-insensitivity or normalization, it's a problem of interaction between two incompatible viewpoints. All I want to do is make git play nicer in an HFS+ world, and this would be far easier if you guys were willing to admit this is a problem that should be solved in the tool rather than a problem with the system.
-Kevin Ballard
--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com