On Tuesday 16 October 2007, Steffen Prohaska wrote:
>
> On Oct 16, 2007, at 2:33 PM, Johannes Schindelin wrote:
>
> >> Maybe we need a configuration similar to core.autocrlf (which
> >> controls
> >> newline conversion) to control filename comparison and normalization?
> >>
> >> Most obviously for the case (in-)sensitivity on Windows, but I also
> >> remember the unicode normalization happening on Mac's HFS filesystem
> >> that caused trouble in the past.
> >
> > Robin Rosenberg has some preliminary code for that. The idea is to
> > wrap
> > all filesystem operations in cache.h, and do a filename normalisation
> > first.
>
> At that point we could add a safety check. Paths that differ only by
> case, or whitespace, or ... (add general and project specific rules
> here)
> should be denied. This would guarantee that tree objects can always be
> checked out. Even if the filesystem capabilities are limited.
>
> Robin, what do you think?

My code only normalizes filenames to UTF-8 inside git, which isn't the
same thing. I think it can be extended to handle MacOSX normalized UTF-8
and Windows UTF-16, so that when you check something out from git there
are no surprises. Case insensitivity is another dimension.

I have no idea about the performance of the code; it is more of a
proof that it can be done. The code cannot "fail": it always does
something reasonable, such as not converting when conversion is not
possible. Something else has to be done for validation.

The UTF-16 that Windows uses is not currently an issue, because git only
handles the local code page. Jgit, but it isn't very smart either because
git doesn't say anything about filename encoding, while
Windows/MacOSX/CIFS and other filesystems do.

The fact that git uses eight-bit file names may also be one reason
performance is slower on Windows, because the eight-bit Win32 API
transforms all strings and filenames to the native UTF-16 encoding on
*every* system call, in and out; that's a lot of work when you do it
thousands of times.
If git itself did the transform, it might be made smarter and better
suited to git's purposes, and most importantly faster. I have no idea
about the performance hit. One has to measure something.

I notice that a number of SCMs out there, including one with a \$\d{4}
price tag, get you into trouble if you rename a file from Foo to FOO on
Windows.

-- robin
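
For illustration only, the safety check Steffen describes (denying paths that differ only by case or by Unicode normalization) could be sketched roughly as below. This is Python rather than git's C, and none of it comes from Robin's patch: the names `comparison_key` and `find_collisions` are hypothetical, and NFC plus case folding is just one assumed rule set standing in for the "general and project specific rules".

```python
import unicodedata
from collections import defaultdict


def comparison_key(path: str) -> str:
    """Hypothetical comparison key for a path.

    Assumption: two paths "collide" if they are equal after Unicode
    NFC normalization and case folding. A real project might add
    further rules (whitespace, reserved names, ...).
    """
    folded = unicodedata.normalize("NFC", path).casefold()
    # Case folding can denormalize a string, so normalize once more.
    return unicodedata.normalize("NFC", folded)


def find_collisions(paths):
    """Return groups of distinct paths that share a comparison key."""
    groups = defaultdict(list)
    for path in paths:
        groups[comparison_key(path)].append(path)
    # Keep only groups with at least two different spellings.
    return [sorted(set(g)) for g in groups.values() if len(set(g)) > 1]


if __name__ == "__main__":
    tree = ["Foo", "FOO", "bar", "caf\u00e9", "cafe\u0301"]
    for group in find_collisions(tree):
        print("would be denied:", [p.encode("unicode_escape") for p in group])
```

A check like this would run before a tree is written, so that every tree object can still be checked out on a case-insensitive or normalizing filesystem. It says nothing about the cost of doing so on every operation, which would still have to be measured.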