Re: git on MacOSX and files with decomposed utf-8 file names

On Jan 21, 2008, at 11:48 AM, David Kastrup wrote:

> Kevin Ballard <kevin@xxxxxx> writes:
>
>> On Jan 21, 2008, at 9:14 AM, Peter Karlsson wrote:
>>
>>> I happen to prefer the text-as-string-of-characters (or code points,
>>> since you use the other meaning of characters in your posts), since I
>>> come from the text world, having worked a lot on Unicode text
>>> processing.
>>>
>>> You apparently prefer the text-as-sequence-of-octets, which I tend to
>>> dislike because I would have thought computer engineers would have
>>> evolved beyond this when we left the 1900s.
>>
>> I agree. Every single problem that I can recall Linus bringing up as a
>> consequence of HFS+ treating filenames as strings is in fact only a
>> problem if you then think of the filename as octets at some point. If
>> you stick with UTF-8 equivalence comparison the entire time, then
>> everything just works.
>
> git calculates hashes over filenames and sorts them. This is not a mere
> question of "UTF-8 equivalence comparison".

No, it's a question of hashing algorithm. And it's one that's fairly easily solved simply by picking a specific unambiguous UTF-8 encoding before hashing.
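
To make that concrete, here is a minimal sketch in Python of what "pick one encoding before hashing" could look like. It is only an illustration, not how git actually computes anything: the helper name filename_digest, the choice of NFC as the canonical form, and hashing the bare name with SHA-1 are all assumptions made for the example.

```python
import hashlib
import unicodedata


def filename_digest(name: str, form: str = "NFC") -> str:
    """Hash a filename after normalizing it to one canonical Unicode form.

    Hypothetical helper for illustration; NFC is an arbitrary choice here,
    any single agreed-upon form (e.g. NFD) would work the same way.
    """
    normalized = unicodedata.normalize(form, name)
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


# The same visible name, once precomposed and once decomposed.
composed = "\u00e4.txt"      # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308.txt"   # "a" followed by U+0308 COMBINING DIAERESIS

# As raw octets the two names differ ...
assert composed.encode("utf-8") != decomposed.encode("utf-8")

# ... but after normalizing to one form they hash identically.
assert filename_digest(composed) == filename_digest(decomposed)
```

The point is just that the ambiguity disappears once every name goes through the same normalization step before it reaches the hash; which form you pick matters less than picking exactly one.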

>> Granted, this is a problem when you have to operate on a filesystem
>> that thinks of filenames as octets,
>
> It also is a problem when operating on a filesystem that considers "ä" a
> single utf-8 character instead of decomposing it.

What makes you say that?

>> but as I said before, this doesn't mean the HFS+ approach is wrong, it
>> just means it's incompatible with Linus's approach.
>
> It is not the business of a file system to juggle with filename
> representations.

You're right, that probably belongs in the VFS layer, but the behavior is the same either way. You can't leave it up to user-space libraries to enforce a filesystem encoding, because you can't rely on all clients to behave properly.

-Kevin Ballard

--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com

