On Sun, 20 Jan 2008, Mike Hommey wrote:
>
> That said, the locale doesn't necessarily express the language in which
> the document is written.

.. and quite commonly, there are multiple languages per document.

The good news is that sorting is almost never relevant or done over
general documents. You sort almost only well-behaved data, and quite often
the exact order is not very important: and when it is, you have very
specific rules (which probably seldom have anything what-so-ever to do
with general Unicode ;).

> It's easy enough to read documents that are not
> written in your native language on the net. That's already what we are both
> doing right now. Fortunately, HTTP and HTML have ways to indicate the
> language in which a document is written, but that leaves out plain
> mail, for instance.

Well, Unicode already handles the "reading" part, just not the sorting.

> That said, the "decomposed" version of UTF-8 has nice side effects on
> OSX, with UTF-8 encoded RockRidge ISO-9660 volumes (with or without
> Joliet ; OSX will use RockRidge by default when it's there), for instance.

I think Unicode in general (and UTF-8 in particular) is a great thing. I
do not argue against Unicode at all. It's what I use myself.

The thing I argue against is that they force normalization (and then, as a
secondary complaint, their insane choice of target format).

Linux is generally UTF-8 too, and does all of this much better. No forced
normalization, and it uses UTF-8 everywhere as the encoding model. Joliet
and RR work beautifully.

(I don't think RR is NFD, btw. It's the standard Microsoft UTF-16 without
normalization, afaik. I think you can happily generate a Rock Ridge disk
that has two _different_ filenames that OS X cannot tell apart, but that
both Linux and Windows can see properly)

		Linus
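
As an illustration of the "two _different_ filenames" case (a minimal
sketch, not part of the original message): the Python below builds two
spellings of one visible name and compares them first byte for byte, the
way a filesystem without forced normalization would, and then after NFD
normalization. The name "é.txt" and the helper names are hypothetical,
chosen only for the example.

```python
import unicodedata

# Two spellings of the same visible name (hypothetical example).
composed = "\u00e9.txt"     # U+00E9, precomposed e-acute
decomposed = "e\u0301.txt"  # 'e' followed by U+0301 combining acute accent


def same_bytes(a: str, b: str) -> bool:
    """Compare names as raw UTF-8 bytes, with no normalization applied."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfd(a: str, b: str) -> bool:
    """Compare names after normalizing both to NFD (decomposed form)."""
    nfd_a = unicodedata.normalize("NFD", a)
    nfd_b = unicodedata.normalize("NFD", b)
    return nfd_a == nfd_b


if __name__ == "__main__":
    print(composed.encode("utf-8"))    # byte sequence of the composed name
    print(decomposed.encode("utf-8"))  # byte sequence of the decomposed name
    print("same bytes:    ", same_bytes(composed, decomposed))
    print("same after NFD:", same_after_nfd(composed, decomposed))
```

A byte-wise comparison treats the two names as distinct, while a
comparison that normalizes first cannot tell them apart.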