On Wed, 23 Jan 2008, Johannes Schindelin wrote:
>
> > End result: practically all projects will never notice anything at all for
> > 99.9% of all files. One extra well-predicted branch, and a few more hash
> > collisions for cases where you have both "Makefile" and "makefile" etc.
>
> Well, that's the point, to avoid having both "Makefile" and "makefile" in
> your repository when you are on case-challenged filesystems, right?

Right. But what I'm saying is that this is *really* cheap to test for with
US-ASCII-only characters, and if only 0.1% of all filenames have unicode in
them, the fact that they are much more expensive isn't even going to be
noticeable. Except for some very odd-ball environments.

> > It's quite possible to do
> >
> > 	utf8_nfd_strcmp(a,b)
> >
> > and (a) do it tons and tons faster and (b) never have to modify the
> > strings themselves. Same goes (even more) for hashing.
>
> Okay. Point taken.

Note that one reason the above is tons faster is that even with complex
unicode, the *common* case is going to be that the names match with a
binary compare.

> But I really hope that you are not proposing to use the case-ignoring
> hash when we are _not_ on a case-challenged filesystem...

I actually suspect that we could, and nobody will notice. The hash would
cause a few more collisions, but not so you'd know.

And the thing is, people who work with other people who are on
case-challenged systems would still want to have the case-insensitive
compare too - although it should just warn, not actually "work".

		Linus
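
For illustration only, here is a rough sketch in C of the two ideas discussed
above: a case-ignoring hash whose only extra cost for US-ASCII bytes is one
well-predicted branch, and a compare that tries a plain binary match first and
never modifies its arguments. The names ascii_fold(), icase_hash() and
icase_name_equal() are hypothetical and are not taken from git; FNV-1a is used
purely as a stand-in hash function. The non-ASCII path is deliberately left as
a stub: a real version would need a Unicode-aware slow path (something like
the utf8_nfd_strcmp() mentioned above), which this sketch does not attempt.

```c
#include <string.h>

/* Hypothetical helper: fold ASCII 'A'..'Z' to lower case, leave other bytes alone. */
static unsigned char ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/*
 * Case-ignoring hash (FNV-1a, chosen only for the sketch).
 * ASCII bytes are folded; bytes with the high bit set are hashed raw,
 * which is a placeholder for a real Unicode normalization step.
 */
static unsigned int icase_hash(const char *name)
{
	unsigned int hash = 2166136261u;

	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if (!(c & 0x80))	/* almost always true, so well predicted */
			c = ascii_fold(c);
		hash = (hash ^ c) * 16777619u;
	}
	return hash;
}

/*
 * Returns 1 if the names match when ASCII case is ignored, 0 otherwise.
 * Neither string is modified.
 */
static int icase_name_equal(const char *a, const char *b)
{
	/* Common case: the names already match byte for byte. */
	if (!strcmp(a, b))
		return 1;

	for (;; a++, b++) {
		unsigned char ca = (unsigned char)*a;
		unsigned char cb = (unsigned char)*b;

		/* Stub: a differing non-ASCII name would go to a slow Unicode path here. */
		if ((ca | cb) & 0x80)
			return 0;
		if (ascii_fold(ca) != ascii_fold(cb))
			return 0;
		if (!ca)
			return 1;	/* both strings ended together */
	}
}
```

With this sketch, "Makefile" and "makefile" land in the same hash bucket and
compare as equal, so a caller could warn about the clash. Names that differ
only in non-ASCII bytes are reported as different, which is exactly the part a
real implementation would have to do properly.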