Re: git on MacOSX and files with decomposed utf-8 file names

On Jan 21, 2008, at 4:18 PM, Theodore Tso wrote:

On Mon, Jan 21, 2008 at 03:58:03PM -0500, Kevin Ballard wrote:
You're making the huge assumption that the HFS+ normalization algorithms
will change. As the technote states:

"Platform algorithms tend to evolve with the Unicode standard. The HFS Plus
algorithms cannot evolve because such evolution would invalidate existing
HFS Plus volumes."

Great, so even worse.  Does the tech note then specify exactly what
version of Unicode HFS+ is using to do its "normalization"?  Or
exactly what characters it will normalize?  After all, Unicode has
added all sorts of characters since 1998, and I'm sure some of them
were combining characters.

And you *really* want to continue to argue that a sane thing for a
cross-platform system to do is to pervert its hash algorithm to take
into account *one* particular OS that happened to freeze a
normalization algorithm at some arbitrary point in time, approximately
nine years ago?  Talk about the tail wagging the dog!!  Especially
when you can't even justify why it was done nine years ago!

I suggest you go back and read the emails where I specifically stated that I'm *not* suggesting this.

It must have bought somebody something, or they never would have done it.

Your faith in the HFS+ designers is touching.

And your arrogance is troubling. Do you really believe you are so smart you can claim the HFS+ designers had no reason for this decision?

I have no idea why HFS+ stores filenames in a normalized form, and further
I am smart enough to know that speculating is completely pointless. I
assume the authors had a good reason (which should be a safe assumption, filesystem authors are a smart bunch). The reason may not be valid anymore, but if it was valid back in 1998, then I can accept it without complaining.

Well, I *AM* a filesystem designer (ext2/ext3/ext4), and well before
1998, I knew that trying to do anything with Unicode normalization was
a fool's errand.  So if you're going to blindly trust filesystem
designers (not something I would recommend, actually :-), trust me.
What HFS+ is doing is dumb, dumb, dumb.

Again, I'm not saying that they necessarily did the "correct" thing, as I can't evaluate that without knowing their reason. I'm just saying there must have been a reason.

And even if *you* can accept it, why should the git designers pervert
any core part of git's design to support this behaviour?  Especially
if it's legacy behaviour which will hopefully be going away, say when
MacOS adopts ZFS --- there's an opportunity for them to start afresh,
and not make the same mistakes they made nine years ago!

And why do you believe MacOS is going to adopt ZFS? Sure, they might, but assuming stuff about the future is just as bad as assuming stuff about the past. And git should "pervert" itself because of the simple fact that git has a problem on HFS+. Keeping your code "pure" is all well and good, except it's not particularly practical. If the git project has any interest in being a viable system on OS X, it really should behave properly. I'm sure you have various "perversions" for other cases.

So why don't you suggest some kind of sane fix in the Mac specific
code that doesn't impact any core part of git, such as its hash
algorithm?  It would be far more productive than trying to defend a
bad design decision made nine years ago....   :-}

How many times must I say I never suggested actually changing git's hashing algorithm? And if you want me to suggest a fix to git that works, first you have to wait for me to learn how git's internals work, and frankly, I have too much work on my plate right now to devote the time necessary to learning git's internals well enough to fix this problem.
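
To make the problem under discussion concrete, here is a minimal Python sketch of why a decomposed file name and its precomposed form look like two different names to anything that compares or hashes raw bytes. It uses standard NFD from Python's unicodedata module as a stand-in for HFS+ behaviour; HFS+ applies its own frozen decomposition variant, so treat NFD here as an approximation. The file name and the helper names (name_bytes, same_after_normalizing) are made up for illustration and are not taken from git.

```python
import hashlib
import unicodedata


def name_bytes(name: str, form: str) -> bytes:
    """Return the UTF-8 bytes of `name` after Unicode normalization `form`."""
    return unicodedata.normalize(form, name).encode("utf-8")


def same_after_normalizing(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two names after bringing both to the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Hypothetical file name containing U+00E4 (a with diaeresis).
original = "M\u00e4rchen.txt"

composed = name_bytes(original, "NFC")    # roughly what the user typed
decomposed = name_bytes(original, "NFD")  # approximation of what HFS+ hands back

# The byte strings differ, so any byte-wise comparison or hash differs too.
print(composed)
print(decomposed)
print(hashlib.sha1(composed).hexdigest() == hashlib.sha1(decomposed).hexdigest())

# Normalizing both sides before comparing treats them as the same name.
print(same_after_normalizing(composed.decode("utf-8"), decomposed.decode("utf-8")))
```

A comparison of this kind could in principle live in platform-specific code without touching how objects are hashed, which is the distinction being argued over above; whether and where that fits in git is not something this sketch settles.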

-Kevin Ballard

--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com

