Re: Cross-Platform Version Control

On 12.5.2009, at 19:13, Johannes Schindelin wrote:
> As to storing all file names in UTF-8, my point about Unicode being not
> necessarily appropriate for everyone still stands.
>
> UTF-8 _might_ be the de-facto standard for Linux filesystems, but
> IMHO we should not take away the freedom for everybody to decide what they
> want their file names to be encoded as.
>
> However, I see that there might be a need to be able to encode the file
> names differently, such as on Windows. IMHO the best solution would be
> a config variable controlling the reencoding of file names.

Exactly. The system should not force the use of a specific encoding. It should only offer a recommendation and remain fully compatible if the user chooses some other encoding.

That's why it's best to always store the information about which encoding was used. It shouldn't matter whether the data is encoded with ISO-8859-1, UTF-8, Shift_JIS, Big5 or some other encoding, as long as the encoding is stated explicitly. Then the reader of the data can best decide how to show that data on the current platform.
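
To make the idea concrete, here is a minimal Python sketch of a file name that carries its encoding with it. The `TaggedName` type and its fields are hypothetical names invented for illustration; nothing like this exists in Git.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedName:
    """Hypothetical record: raw file name bytes plus the encoding that produced them."""
    raw: bytes
    encoding: str  # e.g. "utf-8", "iso-8859-1", "shift_jis", "big5"

    def to_text(self) -> str:
        # The reader decodes with the recorded encoding instead of guessing.
        return self.raw.decode(self.encoding)


# "k\u00e4ytt\u00f6ohje.txt" contains a-umlaut and o-umlaut, written as escapes
# so the example does not depend on the encoding of this source text.
name = "k\u00e4ytt\u00f6ohje.txt"
latin1_name = TaggedName(name.encode("iso-8859-1"), "iso-8859-1")
utf8_name = TaggedName(name.encode("utf-8"), "utf-8")

# Different bytes on disk, same text once the stated encoding is applied.
print(latin1_name.raw != utf8_name.raw)
print(latin1_name.to_text() == utf8_name.to_text())
```

Without the `encoding` field, a reader holding only `latin1_name.raw` could not tell ISO-8859-1 from any other single-byte encoding.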

A config variable defining which encoding should be used when committing file names would make sense. Git should also try to autodetect which encoding is used in its current environment. In the case of UTF-8, you should also be able to specify which normalization form is used (http://www.unicode.org/unicode/reports/tr15/), or whether it is normalized at all.
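
As an illustration of what "which normalization form, or whether it is normalized at all" could mean in practice, here is a small Python sketch using the standard `unicodedata` module. The helper name `normalization_forms` is made up for this example, and it only checks NFC and NFD.

```python
import unicodedata


def normalization_forms(name: str) -> list:
    """Return the forms (out of NFC and NFD) that `name` already satisfies."""
    return [form for form in ("NFC", "NFD")
            if unicodedata.normalize(form, name) == name]


precomposed = "\u00e4"    # a-umlaut as a single code point
decomposed = "a\u0308"    # "a" followed by a combining diaeresis
mixed = precomposed + decomposed

print(normalization_forms(precomposed))  # ['NFC']
print(normalization_forms(decomposed))   # ['NFD']
print(normalization_forms("readme"))     # ['NFC', 'NFD'] (plain ASCII is both)
print(normalization_forms(mixed))        # [] (not normalized at all)
```

The last case is the one that a "not normalized" setting would have to cover: the string is valid UTF-8 but matches neither form.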

For example, it should be possible to configure Git so that when a file is checked out on Mac, its file name is converted to the current file system's encoding (UTF-8 NFD, I think), and when the file is committed on Mac, the file name is normalized back to the same UTF-8 form as is used on Linux (UTF-8 NFC).
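
The checkout and commit conversion could be sketched roughly like this in Python. The function names are hypothetical, and the sketch assumes the Mac file system form is plain NFD, which may be a simplification of what the file system really does.

```python
import unicodedata


def to_worktree(repo_name: str, worktree_form: str = "NFD") -> str:
    """On checkout: convert the repository's name to the file system's form."""
    return unicodedata.normalize(worktree_form, repo_name)


def to_repository(worktree_name: str, repo_form: str = "NFC") -> str:
    """On commit: normalize the file system's name back to the repository's form."""
    return unicodedata.normalize(repo_form, worktree_name)


stored = "\u00e4iti.txt"            # NFC, as it would be committed on Linux
on_disk = to_worktree(stored)       # what the Mac file system would hold

print(on_disk == "a\u0308iti.txt")                         # decomposed on disk
print(on_disk.encode("utf-8") != stored.encode("utf-8"))   # different bytes
print(to_repository(on_disk) == stored)                    # round trip is lossless here
```

The point of the round trip is that the repository never sees the decomposed bytes, so the same path does not show up twice under two byte sequences.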

It would be nice to have config variables saying that all file names in this repository must use UTF-8 NFC, and all commit messages must use UTF-8 NFC (with Unix newlines). Then the Git client would autodetect the current environment's encoding and convert the text, if necessary, to match the repository's encoding.
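
A rough Python sketch of such a repository-wide policy follows. The `REPO_POLICY` keys and both function names are invented for illustration and are not Git settings; the environment encoding is passed in explicitly rather than detected.

```python
import unicodedata

# Hypothetical repository policy; these keys are illustrative, not Git config.
REPO_POLICY = {"encoding": "utf-8", "normalization": "NFC"}


def to_repo_filename(raw: bytes, env_encoding: str, policy=REPO_POLICY) -> bytes:
    """Re-encode a file name from the environment's encoding to the repository's."""
    text = raw.decode(env_encoding)
    text = unicodedata.normalize(policy["normalization"], text)
    return text.encode(policy["encoding"])


def to_repo_message(raw: bytes, env_encoding: str, policy=REPO_POLICY) -> bytes:
    """Same conversion for a commit message, plus Unix newlines."""
    text = raw.decode(env_encoding)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = unicodedata.normalize(policy["normalization"], text)
    return text.encode(policy["encoding"])


# An ISO-8859-1 name and a decomposed UTF-8 name end up as the same bytes.
from_latin1 = to_repo_filename("\u00e4.txt".encode("iso-8859-1"), "iso-8859-1")
from_nfd = to_repo_filename("a\u0308.txt".encode("utf-8"), "utf-8")
message = to_repo_message(b"Fix bug\r\nMore detail\r\n", "cp1252")

print(from_latin1 == from_nfd)
print(message)
```

For the autodetection step, `sys.getfilesystemencoding()` and `locale.getpreferredencoding(False)` are the obvious candidates in Python for supplying `env_encoding`, though whether they match what the user's tools actually write is a separate question.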

- Esko