Re: I'm a total push-over..

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 23 Jan 2008 09:09:46 -0800 (PST)

On Wed, 23 Jan 2008, Johannes Schindelin wrote:
>
> > End result: practically all projects will never notice anything at all for 
> > 99.9% of all files. One extra well-predicted branch, and a few more hash 
> > collisions for cases where you have both "Makefile" and "makefile" etc.
> 
> Well, that's the point, to avoid having both "Makefile" and "makefile" in 
> your repository when you are on case-challenged filesystems, right?

Right. But what I'm saying is that this is *really* cheap to test for for 
US-ASCII-only characters, and if only 0.1% of all filenames have unicode 
in them, the fact that they are much more expensive isn't even going to be 
noticeable. Except for some very odd-ball environments.
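
A minimal sketch of that cheap test, not code from git: a case-ignoring 
hash that stays on a byte-wise path while the name is US-ASCII and only 
branches out at the first byte with the high bit set. The names 
hash_name_icase() and hash_name_slow(), and the choice of FNV-1a, are 
assumptions made for illustration. The slow path is a placeholder that 
mixes non-ASCII bytes in raw; real code would decode UTF-8 and apply 
Unicode folding there.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET 0x811c9dc5u
#define FNV_PRIME  0x01000193u

/*
 * Hypothetical slow path, entered at the first non-ASCII byte.
 * Placeholder only: it folds ASCII case and mixes other bytes in raw.
 * Real code would decode UTF-8 and apply Unicode folding here.
 */
static uint32_t hash_name_slow(const unsigned char *p, size_t len,
			       uint32_t hash)
{
	while (len--) {
		unsigned char c = *p++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		hash = (hash ^ c) * FNV_PRIME;
	}
	return hash;
}

/* Case-ignoring FNV-1a hash of a file name, with an ASCII fast path. */
uint32_t hash_name_icase(const char *name, size_t len)
{
	const unsigned char *p = (const unsigned char *)name;
	uint32_t hash = FNV_OFFSET;

	while (len) {
		unsigned char c = *p;
		if (c & 0x80)	/* the one extra, well-predicted branch */
			return hash_name_slow(p, len, hash);
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		hash = (hash ^ c) * FNV_PRIME;
		p++;
		len--;
	}
	return hash;
}
```

With this, "Makefile" and "makefile" land in the same bucket, which is 
the "few more hash collisions" cost; telling them apart is left to the 
compare that follows the hash lookup.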

> > It's quite possible to do
> > 
> > 	utf8_nfd_strcmp(a,b)
> > 
> > and (a) do it tons and tons faster and (b) never have to modify the 
> > strings themselves. Same goes (even more) for hashing.
> 
> Okay.  Point taken.

Note that one reason the above is tons faster is that even with complex 
unicode, the *common* case is going to be that the names match with a 
binary compare.
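
A rough sketch of that ordering, under stated assumptions: try the 
binary compare first and fall back to the expensive compare only when 
it fails. names_match() and names_match_slow() are hypothetical names, 
and this is not an implementation of utf8_nfd_strcmp(). The slow path 
here only folds ASCII case; a real one would compare the names under 
NFD normalization and Unicode case folding, without modifying either 
string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical slow path. Placeholder only: ASCII case folding, byte by
 * byte. The length check is valid for this placeholder, but not for a
 * real normalizing compare, where equal names can differ in length.
 */
static int names_match_slow(const unsigned char *a, size_t alen,
			    const unsigned char *b, size_t blen)
{
	size_t i;

	if (alen != blen)
		return 0;
	for (i = 0; i < alen; i++) {
		unsigned char ca = a[i], cb = b[i];
		if (ca >= 'a' && ca <= 'z')
			ca -= 'a' - 'A';
		if (cb >= 'a' && cb <= 'z')
			cb -= 'a' - 'A';
		if (ca != cb)
			return 0;
	}
	return 1;
}

/* Returns 1 if the two names match, 0 otherwise. */
int names_match(const char *a, size_t alen, const char *b, size_t blen)
{
	/* Common case: the names are byte-for-byte identical. */
	if (alen == blen && !memcmp(a, b, alen))
		return 1;
	return names_match_slow((const unsigned char *)a, alen,
				(const unsigned char *)b, blen);
}
```

Neither input is copied or rewritten, so the common case costs one 
memcmp().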

> But I really hope that you are not proposing to use the case-ignoring 
> hash when we are _not_ on a case-challenged filesystem...

I actually suspect that we could, and nobody will notice. The hash would 
cause a few more collisions, but not so you'd know.

And the thing is, people who work with other people who are on 
case-challenged systems would still want to have the case-insensitive 
compare too - although it should just warn, not actually "work".

			Linus