Re: Cross-Platform Version Control

Hi,

On Tue, 12 May 2009, Esko Luontola wrote:

> On 12.5.2009, at 19:13, Johannes Schindelin wrote:
> >As to storing all file names in UTF-8, my point about Unicode not 
> >necessarily being appropriate for everyone still stands.
> >
> >UTF-8 _might_ be the de-facto standard for Linux filesystems, but IMHO 
> >we should not take away the freedom for everybody to decide what they 
> >want their file names to be encoded as.
> >
> >However, I see that there might be a need to be able to encode the file 
> >names differently, such as on Windows.  IMHO the best solution would be 
> >a config variable controlling the reencoding of file names.
> 
> Exactly. The system should not force the use of a specific encoding. It 
> should only offer a recommendation, but also be fully compatible if the 
> user uses some other encoding.
> 
> That's why it's best to always store the information about which 
> encoding was used. It shouldn't matter whether the data is encoded with 
> ISO-8859-1, UTF-8, Shift_JIS, Big5 or some other encoding, as long as 
> the encoding is stated explicitly. Then the reader of the data can best 
> decide how to show that data on the current platform.
> 
> A config variable defining which encoding should be used when 
> committing file names would make sense. Git should also try to 
> autodetect which encoding is used in its current environment. In 
> the case of UTF-8, you should also be able to specify which 
> normalization form is used 
> (http://www.unicode.org/unicode/reports/tr15/), or whether it is 
> normalized at all.
> 
> For example, it should be possible to configure Git so that when a file 
> is checked out on a Mac, its file name is converted to the current file 
> system's encoding (UTF-8 NFD, I think), and when the file is committed 
> on a Mac, the file name is normalized back to the same UTF-8 form as is 
> used on Linux (UTF-8 NFC).
> 
> It would be nice to have config variables saying that all file 
> names in this repository must use UTF-8 NFC, and all commit messages 
> must use UTF-8 NFC (with Unix newlines). Then the Git client would 
> autodetect the current environment's encoding, and convert the text, if 
> necessary, to match the repository's encoding.

That is a nice analysis.  How about implementing it?
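
As a starting point, here is a rough sketch of the round trip described 
above, in Python rather than in Git's C. It is only an illustration under 
stated assumptions: the function names, the REPO_FORM setting and the idea 
of passing the environment's encoding in as a parameter are all 
hypothetical, and nothing here corresponds to an existing Git config 
variable. The only library call used is unicodedata.normalize from the 
Python standard library.

```python
import unicodedata
from typing import Optional

# Hypothetical repository-wide setting: names are stored as UTF-8 in this form.
REPO_FORM = "NFC"


def to_repo_name(raw: bytes, env_encoding: str) -> bytes:
    """Convert a file name from the local environment to the repository form.

    raw          -- the name as bytes, exactly as the file system reports it
    env_encoding -- the encoding of the current environment (assumed to be
                    known, e.g. from a config variable or autodetection)
    """
    text = raw.decode(env_encoding)
    return unicodedata.normalize(REPO_FORM, text).encode("utf-8")


def to_checkout_name(stored: bytes, fs_encoding: str,
                     fs_form: Optional[str] = None) -> bytes:
    """Convert a stored (UTF-8) file name to what the local file system expects.

    fs_form is a Unicode normalization form such as "NFD", or None if the
    target encoding is not a Unicode encoding or no normalization is wanted.
    """
    text = stored.decode("utf-8")
    if fs_form is not None:
        text = unicodedata.normalize(fs_form, text)
    return text.encode(fs_encoding)


if __name__ == "__main__":
    # "a" followed by a combining diaeresis, i.e. a decomposed (NFD) name.
    decomposed = "a\u0308.txt".encode("utf-8")
    stored = to_repo_name(decomposed, "utf-8")
    print(stored)                                     # precomposed form
    print(to_checkout_name(stored, "utf-8", "NFD"))   # decomposed again
```

A name that cannot be represented in the target encoding makes encode() 
raise UnicodeEncodeError; what Git should do in that case is a policy 
question this sketch leaves open.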

Ciao,
Dscho
