Re: git on MacOSX and files with decomposed utf-8 file names

David Kastrup <dak@xxxxxxx> · Mon, 21 Jan 2008 22:05:51 +0100

Kevin Ballard <kevin@xxxxxx> writes:

> On Jan 21, 2008, at 3:43 PM, Dmitry Potapov wrote:
>
>> On Mon, Jan 21, 2008 at 11:59:24AM -0500, Kevin Ballard wrote:
>>>
>>> No, it's a question of hashing algorithm. And it's one that's fairly
>>> easily solved simply by picking a specific nonambiguous UTF-8
>>> encoding before hashing.
>>
>> UTF-8 is a *single* encoding, and it maps every Unicode character to
>> a unique binary representation. So, it is completely nonambiguous.
>
> In this case, encoding refers to normalization form, as other people
> have used it in the conversation besides me.

There exists more than one "normalization form" (even across MacOS
platforms), and git is cross-platform.  And people can't be made to
agree on normalization forms, anyway.  You are aware that Unicode code
points are shared between some Chinese and Japanese characters, and
that stroked forms might be composed differently in different
languages?  We don't need to go to the Far East, anyway: in Turkish, İ
and i are equivalent up to case, as are I and ı, whereas in other
European languages, I is instead the uppercase of i.  In the
Netherlands, ÿ is IIRC treated as equivalent to ij.  And so on.
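
To make the first point concrete, here is a minimal sketch using
Python's standard unicodedata module.  The helper name utf8_forms and
the sample file name are made up for illustration; nothing here is
taken from git itself.  It only shows that one visible name has
several valid UTF-8 byte sequences, depending on which of the four
Unicode normalization forms is applied first.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def utf8_forms(name):
    """Return the UTF-8 bytes of `name` under each normalization form.

    Hypothetical helper for illustration only.
    """
    return {form: unicodedata.normalize(form, name).encode("utf-8")
            for form in FORMS}

# The same visible file name, written two ways.
composed = "M\u00e4rchen"      # a-umlaut as one precomposed code point
decomposed = "Ma\u0308rchen"   # 'a' followed by COMBINING DIAERESIS

# Different code point sequences, hence different UTF-8 bytes.
print(composed == decomposed)          # False
print(composed.encode("utf-8"))        # b'M\xc3\xa4rchen'
print(decomposed.encode("utf-8"))      # b'Ma\xcc\x88rchen'

# Normalizing maps one spelling onto the other, but the result
# depends on which form was chosen.
forms = utf8_forms(decomposed)
print(forms["NFC"] == composed.encode("utf-8"))    # True
print(forms["NFD"] == decomposed.encode("utf-8"))  # True

# The compatibility forms go further and rewrite characters that the
# canonical forms leave alone, for example the "fi" ligature U+FB01.
print(unicodedata.normalize("NFC", "\ufb01") == "\ufb01")   # True
print(unicodedata.normalize("NFKC", "\ufb01") == "fi")      # True
```

Note that none of these forms touches case, so the Turkish I/ı
question is a separate problem from normalization.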

> I suggest you stop trying to find inconsequential stuff to argue
> about, especially when a tiny bit of critical thinking would reveal
> the answer.

Now that you have established that you are the only person on the list
capable of critical thinking, how about going elsewhere where you can
find similarly sharp critical thinkers?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum