Re: git on MacOSX and files with decomposed utf-8 file names

On Mon, 21 Jan 2008, Kevin Ballard wrote:

> On Jan 21, 2008, at 9:14 AM, Peter Karlsson wrote:
> 
> > I happen to prefer the text-as-string-of-characters (or code points,
> > since you use the other meaning of characters in your posts), since I
> > come from the text world, having worked a lot on Unicode text
> > processing.
> > 
> > You apparently prefer the text-as-sequence-of-octets, which I tend to
> > dislike because I would have thought computer engineers would have
> > evolved beyond this when we left the 1900s.
> 
> I agree. Every single problem that I can recall Linus bringing up as a
> consequence of HFS+ treating filenames as strings is in fact only a problem if
> you then think of the filename as octets at some point. If you stick with
> UTF-8 equivalence comparison the entire time, then everything just works.
> 
> Granted, this is a problem when you have to operate on a filesystem that
> thinks of filenames as octets, but as I said before, this doesn't mean the
> HFS+ approach is wrong, it just means it's incompatible with Linus's approach.

Linus' approach is _FAST_.

Why do you think Git has now acquired a reputation for kicking ass all 
around the SCM scene?

The HFS+ approach might be fine if you think of it in terms of "the user 
will be awfully confused if two file names are shown identically in the 
File Open dialog box".  But it otherwise sucks big time when it comes to 
high performance applications that need to deal with a huge number of 
file names at once.

Normalization will always hurt performance.  It is overhead.  
Sometimes that overhead is insignificant and imperceptible, but 
sometimes it is not.  And Git is clearly in the latter case.  Performance 
will be hurt big time the day Git is made aware of that normalization. 
This is why there is so much resistance to it, especially when the 
benefits of normalizing file names have not been shown to be worth their 
cost in performance and complexity, and other systems do just fine 
without it.
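
To make the overhead concrete, here is a minimal sketch in Python (not 
Git's actual code, which is C) contrasting a plain octet comparison with 
a normalize-then-compare one.  The function names, the sample file names 
and the choice of NFC are assumptions made for illustration only; any 
timings will vary by machine and say nothing definitive about Git itself.

```python
import timeit
import unicodedata

# The same visible name in two Unicode forms (illustrative sample data).
PRECOMPOSED = "caf\u00e9.txt"   # U+00E9, a single code point for the accented e
DECOMPOSED = "cafe\u0301.txt"   # plain "e" followed by U+0301 combining acute


def same_octets(a: str, b: str) -> bool:
    """Compare file names as raw UTF-8 octets, with no normalization."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_normalized(a: str, b: str) -> bool:
    """Compare file names after normalizing both to NFC (assumed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def compare_cost(number: int = 100_000) -> tuple[float, float]:
    """Return (octet_seconds, normalized_seconds) for `number` comparisons."""
    octets = timeit.timeit(
        lambda: same_octets(PRECOMPOSED, DECOMPOSED), number=number
    )
    normalized = timeit.timeit(
        lambda: same_normalized(PRECOMPOSED, DECOMPOSED), number=number
    )
    return octets, normalized


if __name__ == "__main__":
    print("octets equal:    ", same_octets(PRECOMPOSED, DECOMPOSED))
    print("normalized equal:", same_normalized(PRECOMPOSED, DECOMPOSED))
    print("seconds (octets, normalized):", compare_cost())
```

The octet comparison treats the two names as different, while the 
normalized comparison treats them as equal at the price of extra work on 
every single comparison.  That per-name cost is what adds up when a tool 
has to walk a huge number of file names at once.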


Nicolas