On Jan 21, 2008, at 6:44 PM, Linus Torvalds wrote:
On Mon, 21 Jan 2008, Kevin Ballard wrote:
I find it amusing that you keep arguing against having git treat filenames as unicode when
NO I DO NOT! Dammit, stop this idiocy.
I think it's fine having git treat filenames "as unicode", as long as you don't do any munging on it.
When I say "treat filenames as unicode" I'm implying the equivalence comparisons and everything else that we've been talking about.
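To make the distinction concrete, here is a minimal Python sketch of the two kinds of comparison being argued about. The filenames and helper names are made up for illustration and are not anything in git; it only assumes Python's standard `unicodedata` module.

```python
import unicodedata


def same_bytes(a: str, b: str) -> bool:
    """Compare two names by their exact UTF-8 byte sequences."""
    return a.encode("utf-8") == b.encode("utf-8")


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC.

    Only the comparison normalizes; the names themselves are left unmodified.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Hypothetical filenames: the same visible name in two encodings of "é".
precomposed = "caf\u00e9"   # U+00E9, a single precomposed code point
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(same_bytes(precomposed, decomposed))              # False
print(canonically_equivalent(precomposed, decomposed))  # True
```

The two names differ byte for byte, yet compare equal under canonical equivalence, which is the sense of "as unicode" meant here.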
Why? Because if it's utf-8, then treating them "as unicode" means exactly the same as treating them "as a user-specified string".
If that's what "as unicode" meant, then the phrase "as unicode" has zero meaning.
So stop lying about this whole thing. I have never *ever* argued against unicode per se.
No, you've argued against unicode equivalency in filenames. Can't you figure out, when the entire time I've been talking about equivalency, that I'm *still* talking about equivalency?
All my complaints - every single one of them - comes down to making the idiotic choice of trying to munge those strings (not even strictly "normalize") into something they are not.
Yes, I understand quite well that you are against munging strings.
And what you don't seem to understand is that once you accept _unmodified_ raw UTF-8 as a good unicode transport mechanism, suddenly other encodings are possible. I'm not out to force my world-view on users. If they are using legacy encodings (whether in filenames *or* in commit texts or in their file contents), that's *their* choice.
You're not using raw UTF-8, you're just using raw bytes. Calling it UTF-8 doesn't mean anything, since you don't actually know that's what it is. But this is fairly irrelevant.
I actually personally happen to use UTF-8-encoded unicode.
I'm just not stupid enough to think that (a) corrupting it is a good idea, *or* (b) that I should force every Asian installation of git to also force people to use unicode (or even having all the conversion libraries and overheads!)
So stop this idiotic "unicode == normalization" crap.
I'm a huge fan of UTF-8. But that does not mean that I think normalization is a good idea.
How many times must I say the same thing over and over? I'm not arguing that forced normalization is a good thing. I'm arguing that, in a system which is unicode-aware top to bottom, forced normalization is irrelevant to the user, since they don't care about the exact byte sequence. And I'm also arguing that git should have some solution to this problem. I find it interesting that you're perfectly happy to rant and rail against your misperception of my argument, and yet you consistently and repeatedly ignore my offers to stop this argument and work towards a solution, as well as my comments on existing proposed solutions.
Are you even reading to the end of my emails?
- Kevin Ballard
--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com