Re: git on MacOSX and files with decomposed utf-8 file names

Kevin Ballard <kevin@xxxxxx> writes:

> On Jan 21, 2008, at 9:14 AM, Peter Karlsson wrote:
>
>> I happen to prefer the text-as-string-of-characters (or code points,
>> since you use the other meaning of characters in your posts), since I
>> come from the text world, having worked a lot on Unicode text
>> processing.
>>
>> You apparently prefer the text-as-sequence-of-octets, which I tend to
>> dislike because I would have thought computer engineers would have
>> evolved beyond this when we left the 1900s.
>
> I agree. Every single problem that I can recall Linus bringing up as a
> consequence of HFS+ treating filenames as strings is in fact only a
> problem if you then think of the filename as octets at some point. If
> you stick with UTF-8 equivalence comparison the entire time, then
> everything just works.

git calculates hashes over filenames and sorts them.  This is not a mere
question of "UTF-8 equivalence comparison".
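
To illustrate the point, here is a minimal Python sketch (not git's
actual code) of what happens when the same visible name arrives in
composed (NFC) and decomposed (NFD) form. The filename "Märchen.txt"
and the neighbouring entry "Mz.txt" are made up for the example, and
hashing the bare name with SHA-1 is only a stand-in for hashing an
object whose content includes the name bytes.

```python
import hashlib
import unicodedata

# Hypothetical filename containing "ä" (U+00E4).
name = "M\u00e4rchen.txt"

# Composed form: "ä" is one code point, encoded as bytes C3 A4.
nfc = unicodedata.normalize("NFC", name).encode("utf-8")
# Decomposed form: "a" followed by U+0308, encoded as bytes 61 CC 88.
nfd = unicodedata.normalize("NFD", name).encode("utf-8")

# The two spellings are different byte strings ...
same_bytes = nfc == nfd

# ... so any hash computed over them differs as well.
same_hash = hashlib.sha1(nfc).digest() == hashlib.sha1(nfd).digest()

# Bytewise sorting against another (hypothetical) entry also changes:
# 0xC3 sorts after "z" (0x7A), while "a" (0x61) sorts before it.
other = b"Mz.txt"
order_nfc = sorted([nfc, other])
order_nfd = sorted([nfd, other])

print(same_bytes, same_hash)
print(order_nfc)
print(order_nfd)
```

An equivalence comparison would call the two names equal, yet the
bytes, the hash and the position in a sorted list all differ.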

> Granted, this is a problem when you have to operate on a filesystem
> that thinks of filenames as octets,

It also is a problem when operating on a filesystem that considers "ä" a
single utf-8 character instead of decomposing it.

> but as I said before, this doesn't mean the HFS+ approach is wrong, it
> just means it's incompatible with Linus's approach.

It is not the business of a file system to juggle with filename
representations.

-- 
David Kastrup
