Kevin Ballard <kevin@xxxxxx> writes:

> On Jan 21, 2008, at 3:43 PM, Dmitry Potapov wrote:
>
>> On Mon, Jan 21, 2008 at 11:59:24AM -0500, Kevin Ballard wrote:
>>>
>>> No, it's a question of hashing algorithm. And it's one that's fairly
>>> easily solved simply by picking a specific nonambiguous UTF-8
>>> encoding before hashing.
>>
>> UTF-8 is a *single* encoding, and it maps every Unicode character to
>> a unique binary representation. So, it is completely nonambiguous.
>
> In this case, encoding refers to normalization form, as other people
> have used it in the conversation besides me.

There exists more than one "normalization form" (even across MacOS
platforms), and git is cross-platform. And people can't be made to
agree on normalization forms, anyway.

You are aware that Unicode code points are shared between some Chinese
and Japanese signs, and that stroked forms might be composed
differently in different languages?

We don't need to go to the Far East, anyway: in Turkish, İ and i are
equivalent, as are I and ı, whereas in other European languages, I is
instead equivalent to i. In the Netherlands, ÿ is IIRC equivalent to
ij. And so on.

> I suggest you stop trying to find inconsequential stuff to argue
> about, especially when a tiny bit of critical thinking would reveal
> the answer.

Now that you have established that you are the only person on the list
capable of critical thinking, how about going elsewhere where you can
find similarly sharp critical thinkers?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
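
To make the normalization-form point concrete, here is a minimal sketch in Python. It is only an illustration, not how git hashes anything: the file name "café.txt" and the helper sha1_of are made up for the example, and only the standard unicodedata and hashlib modules are used. It shows that the same visible name has different UTF-8 byte sequences under NFC and NFD, so a hash over the raw bytes differs unless both sides agree on one form first.

```python
import hashlib
import unicodedata


def sha1_of(text: str) -> str:
    """Hypothetical helper: SHA-1 over the UTF-8 bytes of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Illustrative file name; U+00E9 is the precomposed "e with acute".
name = "caf\u00e9.txt"

# NFC keeps the precomposed character; NFD splits it into "e" + U+0301.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

print("NFC bytes:", nfc_bytes)
print("NFD bytes:", nfd_bytes)
print("same bytes?  ", nfc_bytes == nfd_bytes)
print("same SHA-1?  ", sha1_of(nfc) == sha1_of(nfd))

# Converting both to one agreed form makes them compare equal again.
print("equal after NFC?", unicodedata.normalize("NFC", nfd) == nfc)
```

Both strings are valid UTF-8 and render identically, so UTF-8 alone does not settle the matter; which form to pick is exactly the part people do not agree on.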