On 12.5.2009, at 19:13, Johannes Schindelin wrote:
> As to storing all file names in UTF-8, my point about Unicode being not
> necessarily appropriate for everyone still stands.
> UTF-8 _might_ be the de-facto standard for Linux filesystems, but IMHO
> we should not take away the freedom for everybody to decide what they
> want their file names to be encoded as.
> However, I see that there might be a need to be able to encode the file
> names differently, such as on Windows. IMHO the best solution would be
> a config variable controlling the reencoding of file names.
Exactly. The system should not force the use of a specific encoding.
It should only offer a recommendation, and remain fully compatible if
the user chooses some other encoding.
That's why it's best to always store the information about which
encoding was used. It shouldn't matter whether the data is encoded
with ISO-8859-1, UTF-8, Shift_JIS, Big5 or some other encoding, as
long as the encoding is stated explicitly. Then the reader of the
data can best decide how to show that data on the current platform.
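
A minimal Python sketch of that idea, assuming the encoding is stored
next to the raw bytes of the name. The function name `display_name` is
made up for illustration; nothing here is an existing Git feature.

```python
def display_name(raw: bytes, encoding: str) -> str:
    """Decode a stored file name using the encoding recorded with it."""
    return raw.decode(encoding)

# The same name stored under two different encodings: the bytes differ,
# but a reader who knows the encoding recovers the same text.
latin1 = "p\u00e4iv\u00e4.txt".encode("iso-8859-1")
utf8 = "p\u00e4iv\u00e4.txt".encode("utf-8")
```

Without the recorded encoding the reader can only guess, and decoding
the ISO-8859-1 bytes as UTF-8 fails outright.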
A config variable defining which encoding should be used when
committing file names would make sense. Git should also try to
autodetect which encoding is used in its current environment. In
the case of UTF-8, you should also be able to specify which
normalization form is used
(http://www.unicode.org/unicode/reports/tr15/), or whether it is
normalized at all.
For example, it should be possible to configure Git so that when a
file is checked out on Mac, its file name is converted to the current
file system's encoding (UTF-8 NFD, I think), and when the file is
committed on Mac, the file name is normalized back to the same UTF-8
form as is used on Linux (UTF-8 NFC).
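
The round trip could look roughly like this in Python, using the
standard `unicodedata` module. The helper names `to_worktree` and
`to_repository` are hypothetical, and the assumption that the Mac file
system wants NFD is the same guess as above.

```python
import unicodedata

def to_worktree(name: str) -> str:
    """Checkout on Mac: convert the name to decomposed form (NFD)."""
    return unicodedata.normalize("NFD", name)

def to_repository(name: str) -> str:
    """Commit: convert the name back to composed form (NFC)."""
    return unicodedata.normalize("NFC", name)

stored = "\u00e4.txt"          # 'a with diaeresis' as one code point (NFC)
on_disk = to_worktree(stored)  # 'a' followed by U+0308 combining diaeresis
```

The two forms look identical on screen but differ byte for byte, which
is why the name has to be normalized back before it is committed.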
It would be nice to have config variables saying that all file
names in this repository must use UTF-8 NFC, and all commit messages
must use UTF-8 NFC (with Unix newlines). Then the Git client would
autodetect the current environment's encoding, and convert the text,
if necessary, to match the repository's encoding.
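
As a rough sketch of such a conversion in Python, assuming the
repository policy is "UTF-8 NFC, Unix newlines". The function names are
invented for illustration, and `sys.getfilesystemencoding()` is only
one possible way to guess the environment's encoding.

```python
import sys
import unicodedata

def name_to_repo(raw: bytes, env_encoding: str = "") -> bytes:
    """Convert a file name from the environment's encoding to UTF-8 NFC."""
    # Fall back to Python's guess when no encoding is given explicitly.
    env_encoding = env_encoding or sys.getfilesystemencoding()
    text = raw.decode(env_encoding)
    return unicodedata.normalize("NFC", text).encode("utf-8")

def message_to_repo(text: str) -> bytes:
    """Convert a commit message to UTF-8 NFC with Unix newlines."""
    unix = text.replace("\r\n", "\n").replace("\r", "\n")
    return unicodedata.normalize("NFC", unix).encode("utf-8")
```

A name that arrives as ISO-8859-1 and one that arrives as UTF-8 NFD
both end up as the same bytes in the repository.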
- Esko