Karsten Blees <karsten.blees@xxxxxxxxx> writes:

> diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt
> index e9a1d5d..e5f6233 100644
> --- a/Documentation/i18n.txt
> +++ b/Documentation/i18n.txt
> @@ -1,18 +1,28 @@
> -At the core level, Git is character encoding agnostic.
> -
> - - The pathnames recorded in the index and in the tree objects
> -   are treated as uninterpreted sequences of non-NUL bytes.
> -   What readdir(2) returns are what are recorded and compared
> -   with the data Git keeps track of, which in turn are expected
> -   to be what lstat(2) and creat(2) accepts. There is no such
> -   thing as pathname encoding translation.
> +Git is to some extent character encoding agnostic.

I do not think the removal of the text makes much sense here unless
you add the equivalent to the new text below.

>   - The contents of the blob objects are uninterpreted sequences
>     of bytes. There is no encoding translation at the core
>     level.
>
> - - The commit log messages are uninterpreted sequences of non-NUL
> -   bytes.
> + - Pathnames are encoded in UTF-8 normalization form C. This

That is true only on some systems like OSX (with HFS+) and Windows,
no? BSDs in general and Linux do not do any such mangling IIRC.

I am OK with mangling described as a notable oddball to warn users,
though; i.e. not as a norm as your new text suggests but as an
exception.

> +   platforms. If file system APIs don't use UTF-8 (which may be
> +   file system specific), it is recommended to stick to pure
> +   ASCII file names.

Hmph, who endorsed such a recommendation?

It is recommended to stick to whatever naming scheme that would not
cause troubles to project participants. If your participants all
want to (and can) use ISO-8859-1, we do not discourage them from
doing so.
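
To make the byte-level point concrete, here is a minimal sketch (not taken from Git itself) of why "uninterpreted sequences of bytes" and "UTF-8 normalization form C" are different promises. The file name and the helper names `path_bytes` and `is_valid_utf8` are made up for illustration; only Python's standard `unicodedata` module is assumed.

```python
import unicodedata


def path_bytes(name: str, form: str) -> bytes:
    """Return the UTF-8 bytes of `name` after normalizing it to `form`."""
    return unicodedata.normalize(form, name).encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical file name; the accented letter is one precomposed code point.
name = "caf\u00e9.txt"

nfc = path_bytes(name, "NFC")        # precomposed form
nfd = path_bytes(name, "NFD")        # decomposed form: 'e' + combining accent
latin1 = name.encode("iso-8859-1")   # what an ISO-8859-1 project would store

print(nfc)                    # b'caf\xc3\xa9.txt'
print(nfd)                    # b'cafe\xcc\x81.txt'
print(nfc == nfd)             # False: same visible name, different bytes
print(is_valid_utf8(latin1))  # False: a valid pathname, but not UTF-8
```

A tool that compares pathnames as raw bytes treats all three as distinct names, which is why any normalization done by a file system is worth documenting as a platform-specific exception.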