On Mon, 4 Dec 2006, Linus Torvalds wrote:
>
> If a filesystem cannot represent that name AS THAT BYTE SEQUENCE then the
> filesystem is broken. No ifs, buts, maybes about it. I'm sorry, but that's
> how it is.

Btw, what this means in practice is that when git creates a file with a
certain sequence of bytes, then

 (a) readdir had better return _that_ sequence of bytes, or git will see
     it as something else.

 (b) opening it with that same sequence of bytes had better work.

This does not mean that a filesystem may not internally use some other
encoding. It just means that the filesystem - when converting back and
forth between the internal encoding and the one it shows to user space -
had better convert back to the exact same thing.

Also, note that for most projects, even a broken filesystem doesn't
actually matter - it's enough that the filesystem gets the conversions
right for the particular set of names in a particular project. So any
project that just has 7-bit filenames will obviously never even see any
issues at all, even if the filesystem it runs on then does something
strange with 8-bit filenames.

This is one reason why UNIX's "everything is a stream of bytes" is so
important, and why programs should generally work with byte streams, not
"wide strings" or similar. It's the only way that you can reliably work
across different locales. Use wide strings and locale-specific stuff
_only_ for actually showing users something on the tty, for example.

		Linus
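
For anyone who wants to try the (a)/(b) requirements on their own filesystem, here is a rough sketch in Python. It is only an illustration, not anything git itself does: the helper name `roundtrips` is made up for this example, and it assumes a platform where Python accepts byte-string paths. It creates a file from a raw byte sequence in a scratch directory, then checks that the directory listing returns exactly those bytes and that the same bytes open the file again.

```python
import os
import tempfile


def roundtrips(name: bytes) -> bool:
    """Return True if the filesystem hands `name` back byte-for-byte.

    `roundtrips` is a hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_b = os.fsencode(tmp)          # work with bytes paths throughout
        path = os.path.join(tmp_b, name)
        try:
            # Create the file using the raw byte sequence.
            with open(path, "wb") as f:
                f.write(b"x")
        except OSError:
            return False                  # the filesystem refused the name
        # (a) the directory listing must contain exactly the bytes we used.
        listed = name in os.listdir(tmp_b)
        # (b) opening with that same byte sequence must work.
        try:
            with open(path, "rb") as f:
                reopened = f.read() == b"x"
        except OSError:
            reopened = False
        return listed and reopened


if __name__ == "__main__":
    # A 7-bit name, and an 8-bit name that is not valid UTF-8.
    for candidate in (b"Makefile", b"caf\xe9.txt"):
        print(candidate, roundtrips(candidate))
```

A plain 7-bit name should pass wherever the scratch directory is writable. What the 8-bit name reports depends entirely on the filesystem the scratch directory lives on, which is exactly the point above.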