Lars Noschinski <lars-2008-2@xxxxxxxxxxxxxxxxxxxx> writes:

> * Peter Krefting <peter@xxxxxxxxxxxxxxxx> [09-03-03 12:54]:
> > Lars Noschinski:
> > >Changing the filename (on checkout), so that the user sees an Ü regardless of
> > >his or her locale (instead of an \0xDC, which only resolves to an Ü on
> > >latin-1) would be an absolutely broken concept here.
> >
> > Why would it? It is my view as a user on my files that define how file names
> > are looked upon. If I have three machines, one Linux box using a iso8859-1
> > locale, an OS X box (where, I would believe, file APIs use UTF-8, someone
> > please correct me if I'm wrong), and a Windows box (which uses UTF-16 on the
> > file system layer, but does provide compatibility functions that use char
> > pointers), and create a file on each of these called "Ü.txt" (which would be
> > the sequence "DC 2E 74 78 74" on the Linux box, "C3 9C 2E 74 78 74" (or
> > probably something else since I believe OS X decomposes the string) on the OS X
> > box and "00DC 002E 0074 0078 0074" on the Windows box, I see these three file
> > names as equal.
>
> Because a function in the source code refers to (e.g.) "DC 2E 74 78 74",
> not "C3 9C 2E 74 78 74" nor "00DC 0024 0074 0078 0074". And it does so
> regardless of the locale.

The only actual language I know of where I've seen people use non-ASCII names
for referenced files, i.e. classes, is Java, and there you specify the encoding
to the compiler. Class names are not byte sequences there. XML files are
another case where referenced files are defined in Unicode. I assume this
applies to C# and other modern languages too.

> The file name may look funny depending on your locale, but if you rename
> the file to fit your local enconding, it would not work.

In the Java case, you /have/ to "rename" or the build will break. Build systems
like Ant or Maven require you to "rename" too, regardless of what you build.
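For illustration, the byte sequences quoted above for "Ü.txt" can be reproduced
with a few lines of Java. This is only a sketch: the class and method names
(FileNameBytes, hex, encode, codeUnits) are made up for this example, and it
assumes the decomposed form Peter mentions for OS X is plain NFD, which may not
match exactly what the file system does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class FileNameBytes {
    // Render bytes as space-separated upper-case hex pairs, e.g. "DC 2E".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode a file name in the given charset and show the resulting bytes.
    static String encode(String name, Charset cs) {
        return hex(name.getBytes(cs));
    }

    // Show the UTF-16 code units of a name, as in the Windows example.
    static String codeUnits(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Ü.txt", written with an escape so the source file encoding is irrelevant.
        String name = "\u00DC.txt";
        // Assumption: NFD stands in for the decomposition done on OS X.
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);

        System.out.println("ISO-8859-1:   " + encode(name, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 (NFC):  " + encode(name, StandardCharsets.UTF_8));
        System.out.println("UTF-8 (NFD):  " + encode(decomposed, StandardCharsets.UTF_8));
        System.out.println("UTF-16 units: " + codeUnits(name));
    }
}
```

The same Java string yields four different on-disk representations, which is
why a tool that treats names as characters and one that treats them as raw
bytes can disagree about whether two files have the same name.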
A C Git clone will produce unbuildable code, but JGit will produce a working
one on Unicode-aware systems, and documentation, the case where Unicode
filenames are more common than in source, will look good.

-- robin

PS. I re-added the people you forgot to Cc.