On Mon, 4 Dec 2006, Johannes Schindelin wrote:
>
> On Mon, 4 Dec 2006, Jakub Narebski wrote:
>
> > Johannes Schindelin wrote:
> >
> > > On Mon, 4 Dec 2006, Jakub Narebski wrote:
> > >
> > >> [...] git should acquire core.filesystemEncoding configuration variable
> > >> which would encode from filesystem encoding used in working directory
> > >> and perhaps index to UTF-8 encoding used in repository (in tree objects)
> > >> and perhaps index.
> > >
> > > So, you want to pull in all thinkable encodings? Of course, you could rely
> > > on libiconv, adding yet another dependency to git. (Yes, I know, mailinfo
> > > uses it already. But I never use mailinfo, so I do not need libiconv.)
> >
> > A conditional dependency. If you don't have libiconv, this feature wouldn't
> > be available.
>
> You are speaking as somebody compiling git from source. We are a minority.

You guys are ignoring the _real_ problem. It has nothing at all to do with
dependencies on external packages.

The REAL problem is that if you do locale-dependent trees and other git
objects, git will STOP WORKING.

A filename in a tree object _has_ to be seen as a pure 8-bit character
stream. They _have_ to be compared with "memcmp()", and they have to sort
the same way and mean EXACTLY the same thing for everybody.

If a filesystem cannot represent that name AS THAT BYTE SEQUENCE then the
filesystem is broken. No ifs, buts, maybes about it. I'm sorry, but that's
how it is.

This is _exactly_ the same issue as case independence. Git does not ignore
case, and it really CANNOT ignore case. Ignoring case would cause horrible
and deep problems, and it has nothing to do with dependencies on libraries
(although it _would_ get much much worse from locale settings, because
different locales have different case rules and would again compare the
same name differently).

So it really boils down to one thing: git saves a byte stream. Not text.
This is true for all levels of the git archive.
It's true for blob content, it's true for filenames in trees, and it is
true for commits. The commit message is actually somewhat easier (because
we have nothing to "compare" it to afterwards in the checked-out tree), so
the commit message is the _one_ thing we can kind of play games with, but
even there, once it's done, it's done, and it's just a stream of bytes.

		Linus
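
To make the "memcmp()" point concrete, here is a minimal sketch in C of what
comparing names as pure byte streams looks like. The helpers name_compare(),
name_equal() and check_examples() are hypothetical names made up for this
illustration; this is not git's actual code, and it ignores any special
handling git applies to directory entries when sorting trees. It only shows
that the result depends on the bytes alone, never on the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from git): order two names strictly as
 * raw bytes. memcmp() compares unsigned bytes and never consults the
 * locale, so every machine computes the same answer.
 */
static int name_compare(const char *a, size_t alen,
                        const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int cmp = memcmp(a, b, n);

    if (cmp)
        return cmp;
    /* The common prefix is identical: the shorter name sorts first. */
    return (alen > blen) - (alen < blen);
}

/* Hypothetical helper: two names are the same only if every byte matches. */
static int name_equal(const char *a, const char *b)
{
    size_t alen = strlen(a), blen = strlen(b);

    return alen == blen && memcmp(a, b, alen) == 0;
}

/* A few consequences of treating names as bytes rather than text. */
static void check_examples(void)
{
    /* Case is significant: these are two different names. */
    assert(!name_equal("README", "readme"));

    /*
     * Both strings display as "cafe" with an accented final letter, but
     * one uses the precomposed character (C3 A9) and the other uses 'e'
     * plus a combining accent (65 CC 81). Different bytes, so they are
     * different names.
     */
    assert(!name_equal("caf\xc3\xa9", "cafe\xcc\x81"));

    /* 'M' (0x4D) is a smaller byte than 'm' (0x6D), in every locale. */
    assert(name_compare("Makefile", 8, "makefile", 8) < 0);

    /* A name that is a strict prefix of another sorts before it. */
    assert(name_compare("a", 1, "ab", 2) < 0);
}
```

A locale-aware comparison such as strcoll() could order or equate some of
these pairs differently depending on the user's settings, which is exactly
the kind of disagreement between machines described above.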