Re: git on MacOSX and files with decomposed utf-8 file names

On Mon, Jan 21, 2008 at 10:12:01AM -0800, Linus Torvalds wrote:
> 
> 
> On Mon, 21 Jan 2008, Kevin Ballard wrote:
> > On Jan 21, 2008, at 9:14 AM, Peter Karlsson wrote:
> > > 
> > > I happen to prefer the text-as-string-of-characters (or code points,
> > > since you use the other meaning of characters in your posts), since I
> > > come from the text world, having worked a lot on Unicode text
> > > processing.
> > > 
> > > You apparently prefer the text-as-sequence-of-octets, which I tend to
> > > dislike because I would have thought computer engineers would have
> > > evolved beyond this when we left the 1900s.
> > 
> > I agree. Every single problem that I can recall Linus bringing up as a
> > consequence of HFS+ treating filenames as strings [..]
> 
> You say "I agree", BUT YOU DON'T EVEN SEEM TO UNDERSTAND WHAT IS GOING ON.
> 
> The fact is, text-as-string-of-codepoints (let's make the "codepoints" 
> obvious, so that there is no ambiguity, but I'd also like to make it clear 
> that a codepoint *is* how a Unicode character is defined, and a Unicode 
> "string" is actually *defined* to be a sequence of codepoints, and totally 
> independent of normalization!) is fine.
> 
> That was never the issue at all. Unicode codepoints are wonderful.
> 
> Now, git _also_ heavily depends on the actual encoding of those 
> codepoints, since we create hashes etc, so in fact, as far as git is 
> concerned, names have to be in some particular encoding to be hashed, and 
> UTF-8 is the only sane encoding for Unicode. People can blather about 
> UCS-2 and UTF-16 and UTF-32 all they want, but the fact is, UTF-8 is 
> simply technically superior in so many ways that I don't even understand 
> why anybody ever uses anything else.

Maybe because it's 1.5 times bigger for any text in Chinese, Japanese or
Korean?
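
The 1.5 figure comes from the byte counts: a CJK character in the Basic
Multilingual Plane takes 3 bytes in UTF-8 but 2 bytes in UTF-16. A rough
sketch in Python to check this; the function name encoded_sizes and the
sample strings are made up for illustration, and the ratio drops as soon
as ASCII is mixed in:

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM counted."""
    utf8 = len(text.encode("utf-8"))
    # "utf-16-le" is used so the 2-byte byte-order mark is not included.
    utf16 = len(text.encode("utf-16-le"))
    return utf8, utf16


# Illustrative samples only: one pure-CJK string, one mixed with ASCII.
samples = {
    "pure CJK": "日本語のテキスト",
    "mixed": "git: 日本語",
}

for label, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{label}: utf-8={utf8} utf-16={utf16} ratio={utf8 / utf16:.3f}")
```

For the pure-CJK sample the ratio is 1.5; for the mixed sample UTF-8
comes out smaller than UTF-16, since ASCII costs 1 byte instead of 2.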

Mike