Hi,

On Thu, 24 Jan 2008, Junio C Hamano wrote:

> [A nice, concise, well written and obviously thought-through summary of
> the case sensitivity and UTF-8 file name issues.]

Thank you Junio.  It must have taken much more time than just sitting
down and hacking into the keyboard.  By this thinking before writing, you
invested some time that you save all the readers, including me.  I
appreciate that very much.

> [Goes on to describe what we do with symlinks when the filesystem is not
> capable of representing symlinks; compares that situation to the
> filenames situation.]

There is a fundamental difference between the symlinks situation and the
filename situation that you should keep in mind: even if the filesystem
cannot create symlinks, the nature of filenames as unique keys is not
changed.  You cannot have a symlink and a file of the same name.  In a
way, it takes away a degree of freedom of the _values_ that the _keys_
point to.

The same is not true for the case-challenged filesystems; they change the
nature from unique keys to semi-unique keys.  So while other filesystems
can discern all different keys, these challenged filesystems cannot; they
take away a degree of freedom of the _keys_.

It is much easier to cope with the lack of degree of freedom in values;
you have to store the metadata somewhere else -- in this case the index --
but it is still easily accessible by the key.

But that is not possible if two different _keys_ are not accepted as
different by the filesystem.  You can still store the different metadata
in the index, but the _content_ cannot be in the filesystem under the
desired keys; not at the same time, anyway.
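To make the "semi-unique keys" point concrete, here is a minimal sketch in
C.  It assumes plain ASCII case folding only, which is certainly not what
every challenged filesystem does, and the names icase_hash(), icase_equal()
and keys_collide() are made up for this illustration; they are not existing
git functions.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: hash a name so that names differing only in
 * ASCII case end up in the same bucket (FNV-1a over the lowercased bytes).
 */
unsigned int icase_hash(const char *name)
{
    unsigned int hash = 0x811c9dc5u;    /* FNV-1a 32-bit offset basis */

    for (; *name; name++) {
        hash ^= (unsigned char)tolower((unsigned char)*name);
        hash *= 0x01000193u;            /* FNV-1a 32-bit prime */
    }
    return hash;
}

/*
 * Hypothetical helper: return 1 if the two names are equivalent under
 * ASCII case folding, 0 otherwise.
 */
int icase_equal(const char *a, const char *b)
{
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
    return *a == *b;    /* equal only if both strings ended together */
}

/*
 * Two names "collide" when they are different keys byte-wise (so the
 * index can hold both) but equivalent after folding (so a case-folding
 * filesystem can hold only one of them at a time).
 */
int keys_collide(const char *a, const char *b)
{
    return strcmp(a, b) != 0 && icase_equal(a, b);
}
```

With these definitions, keys_collide("xt_CONNMARK.c", "xt_connmark.c") is
true: two keys in the index, but only one slot in the work tree.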
> Perhaps we could have something like:
>
>  $ git show :xt_CONNMARK.c >xt_connmark-1.c
>  $ edit xt_connmark-1.c
>  $ git add --as xt_CONNMARK.c xt_connmark-1.c

Something similar is already possible:

	$ git checkout xt_CONNMARK.c
	$ edit xt_CONNMARK.c
	$ git add xt_CONNMARK.c

but you have to keep in mind that

- "git add -u" or "git commit -a" is a no-no-no, and

- the system will not build, no matter what you change in git on those
  filesystems.

Having said that, I think that a config variable/commit hooks for those
repositories which _happen_ to live on sane filesystems, but have to be
checked out on challenged ones, makes absolute sense.  (The commit hook is
possible already, but less efficient than the config variable.)

> If it is a new file, we won't find any name that is equivalent to $A in
> the index, and we use the name $A obtained from readdir(3).
>
> BUT with a twist.
>
> If the filesystem is known to be inconveniently case folding, we are
> better off registering $B instead of $A (assuming we can convert from $A
> to $B).

I tend to agree with Nico.  We should not "learn" from the challenged
filesystems.

> Tasks
> -----
>
>  - Identify which case folding filesystems need to be supported,
>    and make sure somebody understands its folding logic;
>
>  - For each supported case folding logic, these are needed:
>
>    - a hash function that throws "equivalent" names in the same
>      bucket, to be used in Linus's patch;

AFAIR Linus wanted to have one hash function to rule them all.  That would
be way cool, since it means fewer possibilities for bugs to go undetected.

>    - a compare function to determine equivalent names;

AFAICT we need three functions: strcasecmp(), utf8_strcmp() and
utf8_strcasecmp().  Although I might be wrong, and the second is not
needed.

Probably the answer for this has been buried in many, many lines that I
decided not to read.  Maybe I'll ask Randal on IRC, he's usually very
quick to give me reasonable and concise answers.
And then we trash-talk a little, just for fun.

Ciao,
Dscho