On Mon, Jan 21, 2008 at 02:05:51PM -0500, Kevin Ballard wrote:
> >
> >But that is *entirely* a separate issue from "normalization".
> >
> >Kevin, you seem to think that normalization is somehow forced on you by
> >the "text-as-codepoints" decision, and that is SIMPLY NOT TRUE.
> >Normalization is a totally separate decision, and it's a STUPID one,
> >because it breaks so many of the _nice_ properties of using UTF-8.
>
> I'm not saying it's forced on you, I'm saying when you treat filenames
> as text,

Treating filenames as text can mean different things to different people.
Some may prefer "fi" and the fi ligature to be treated as the same in some
contexts.

> it DOESN'T MATTER if the string gets normalized. As long as
> the string remains equivalent,

As a matter of fact it does matter; otherwise the characters would be the
same and we would not be having this conversation at all. Strings can be
equivalent and not equivalent at the same time, because there are different
equivalence relations. Finally, what HFS+ does is not even normalization.
In the technote, Apple explains that they decompose some characters but not
others for better compatibility. So, you see, there is a PROBLEM here.

> YOU DON'T CARE about the underlying
> byte stream.

It is not about the byte stream. After all, if it were UTF-16 instead of
UTF-8, that would be a one-to-one conversion for each character. So what
gets corrupted by HFS+ are Unicode *characters*.

>
> Alright, fine. I'm not saying HFS+ is right in storing the normalized
> version, but I do believe the authors of HFS+ must have had a reason
> to do that,

I am not saying they did it without *any* reason. I suppose all the Apple
developers on the Copland project had reasons for what they did too, but
the outcome was not very good...

> The only information you lose when doing canonical normalization is
> what the original byte sequence was.

Not true. You lose the original sequence of *characters*.
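
The points above can be checked with a short Python sketch. It is only an
illustration: it uses the standard Unicode normalization forms from
Python's unicodedata module, not the HFS+ variant (which, as said above,
decomposes only some characters), and the helper names and sample strings
are my own choices.

```python
import unicodedata

# Two different character sequences that render as the same "e with acute".
COMPOSED = "\u00e9"      # one character: U+00E9
DECOMPOSED = "e\u0301"   # two characters: U+0065 + U+0301 (combining acute)

# The fi ligature and the plain two-letter sequence.
LIGATURE = "\ufb01"      # U+FB01 LATIN SMALL LIGATURE FI
PLAIN_FI = "fi"


def canonically_equal(a: str, b: str) -> bool:
    """Equivalence under canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def compatibility_equal(a: str, b: str) -> bool:
    """Equivalence under compatibility decomposition (NFKD)."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)


def survives_reencoding(s: str) -> bool:
    """Re-encoding UTF-8 -> UTF-16 -> back keeps every character intact."""
    utf8 = s.encode("utf-8")
    utf16 = utf8.decode("utf-8").encode("utf-16-le")
    return utf16.decode("utf-16-le") == s


def survives_normalization(s: str) -> bool:
    """True only if NFD leaves the original character sequence unchanged."""
    return unicodedata.normalize("NFD", s) == s


if __name__ == "__main__":
    # Different characters, yet canonically equivalent.
    print(COMPOSED == DECOMPOSED, canonically_equal(COMPOSED, DECOMPOSED))
    # Same pair of strings, two equivalence relations, two answers.
    print(canonically_equal(LIGATURE, PLAIN_FI),
          compatibility_equal(LIGATURE, PLAIN_FI))
    # Changing the encoding loses nothing; normalizing loses the original.
    print(survives_reencoding(COMPOSED), survives_normalization(COMPOSED))
```

The ligature pair is equivalent under one relation and not under the other,
and the composed "e with acute" survives a change of encoding but not
normalization. After NFD there is no way to tell which of the two sequences
the name was created with.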
Dmitry