Hi,

On Tue, 12 May 2009, Esko Luontola wrote:

> On 12.5.2009, at 19:13, Johannes Schindelin wrote:
> >As to storing all file names in UTF-8, my point about Unicode being not
> >necessarily appropriate for everyone still stands.
> >
> >UTF-8 _might_ be the de-facto standard for Linux filesystems, but IMHO
> >we should not take away the freedom for everybody to decide what they
> >want their file names to be encoded as.
> >
> >However, I see that there might be a need to be able to encode the file
> >names differently, such as on Windows. IMHO the best solution would be
> >a config variable controlling the reencoding of file names.
>
> Exactly. The system should not force the use of a specific encoding. It
> should only offer a recommendation, but be also fully compatible if the
> user uses some other encoding.
>
> That's why it's best to always store the information about what encoding
> was used. It shouldn't matter, whether the data is encoded with
> ISO-8859-1, UTF-8, Shift_JIS, Big5 or some other encoding, as long as it
> is explicitly said that what the encoding is. Then the reader of the
> data can best decide, how to show that data on the current platform.
>
> A config variable for defining, that what encoding should be used when
> committing the file names, would make sense. Git should also try to
> autodetect, that what encoding is used in its current environment. In
> the case of UTF-8, you should also be able to specify which
> normalization form is used
> (http://www.unicode.org/unicode/reports/tr15/), or whether it is
> normalized at all.
>
> For example, it should be possible to configure Git so, that when a file
> is checked out on Mac, its file name is converted to the current file
> system's encoding (UTF-8 NFD, I think), and when the file is committed
> on Mac, the file name is normalized back to the same UTF-8 form as is
> used on Linux (UTF-8 NFC).
>
> It would be nice to have config variables for saying, that all file
> names in this repository must use UTF-8 NFC, and all commit messages
> must use UTF-8 NFC (with Unix newlines). Then the Git client would
> autodetect the current environment's encoding, and convert the text, if
> necessary, to match the repository's encoding.

That is a nice analysis.  How about implementing it?

Ciao,
Dscho
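
As a starting point, the NFC/NFD round trip described in the quoted text could be sketched roughly as follows. This is only an illustration in Python, not Git code: the function names to_repo_form, to_checkout_form and reencode_name are hypothetical, and it assumes the repository stores names as UTF-8 NFC while the Mac file system wants UTF-8 NFD.

```python
import unicodedata


def to_repo_form(name: str) -> str:
    """Normalize a file name to NFC, the form assumed for the repository."""
    return unicodedata.normalize("NFC", name)


def to_checkout_form(name: str) -> str:
    """Normalize a file name to NFD, the form assumed for a Mac checkout."""
    return unicodedata.normalize("NFD", name)


def reencode_name(raw: bytes, source_encoding: str) -> bytes:
    """Decode a name from a legacy encoding and store it as UTF-8 NFC.

    source_encoding stands in for the (hypothetical) config variable
    that says which encoding the local file names use.
    """
    return to_repo_form(raw.decode(source_encoding)).encode("utf-8")


# "Mäkelä.txt" written with precomposed characters (already NFC).
name = "M\u00e4kel\u00e4.txt"

decomposed = to_checkout_form(name)   # each "ä" becomes "a" + U+0308
restored = to_repo_form(decomposed)   # back to the precomposed form
from_latin1 = reencode_name(name.encode("iso-8859-1"), "iso-8859-1")
```

The two normalization calls are inverses for names like this one, so a checkout on Mac followed by a commit gives back the same bytes that Linux would have stored. Whether that holds for every file name, and where in Git such a conversion would have to happen, is left open here.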