On Fri, 25 Jan 2008, Marko Kreen wrote:
>
> Well, although this is very clever approach, I suggest against it.
> You'll end up with complex code that gives out substandard results.

Actually, *your* operation is the one that gives substandard results.

> I think its better to have separate case-folding function (or several),
> that copies string to temp buffer and then run proper optimized hash
> function on that buffer.

I'm sorry, but you just cannot do that efficiently and portably.

I can write a hash function that reliably does 8 bytes at a time for the
common case on a 64-bit architecture, exactly because it's easy to do
"test high bits in parallel" with a simple bitwise 'and', and we can do
the same with "approximate lower-to-uppercase 8 bytes at a time" for a
hash by just clearing bit 5.

In contrast, trying to do the same thing in half-way portable C, but
being limited to having to get the case-folding *exactly* right (which
you need for the comparison function) is much much harder. It's basically
impossible in portable C (it's doable with architecture-specific
features, ie vector extensions that have per-byte compares etc).

And hashing is performance-critical, much more so than the compares (ie
you're likely to have to hash tens of thousands of files, while you will
only compare a couple). So it really is worth optimizing for.

And the thing is, "performance" isn't a secondary feature. It's also not
something you can add later by optimizing.

It's also a mindset issue. Quite frankly, people who do this by "convert
to some folded/normalized form, then do the operation" will generally
make much more fundamental mistakes. Once you get into the mindset of
"let's pass a corrupted string around", you are in trouble. You start
thinking that the corrupted string isn't really "corrupt", it's in an
"optimized format".

And it's all downhill from there. Don't do it.

Linus
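
For readers who want to see the "8 bytes at a time" idea spelled out, here is a minimal sketch in C. It is not the code from any actual git patch: the function name icase_hash, the mix() helper and the choice of FNV-style constants are all assumptions made for illustration. It shows the two tricks named in the mail: one bitwise 'and' against 0x80 in every byte detects any non-ASCII byte in the word, and clearing bit 5 of every byte folds 'a'..'z' onto 'A'..'Z'. The fold is only approximate (for example '{' lands on '['), which is acceptable for a hash because it can only add collisions. It would not be acceptable for the comparison function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGH_BITS 0x8080808080808080ULL  /* bit 7 of every byte */
#define CASE_BITS 0x2020202020202020ULL  /* bit 5 of every byte */

/* Xor-then-multiply mixing step using the 64-bit FNV prime.
 * The choice of mixer is an assumption, not part of the original mail. */
static uint64_t mix(uint64_t hash, uint64_t value)
{
	return (hash ^ value) * 0x100000001b3ULL;
}

/* Hypothetical helper: case-insensitive hash of 'len' bytes at 'name'. */
static uint64_t icase_hash(const char *name, size_t len)
{
	uint64_t hash = 0xcbf29ce484222325ULL;  /* 64-bit FNV offset basis */

	/* Fast path: whole 8-byte words that contain only ASCII. */
	while (len >= 8) {
		uint64_t word;

		memcpy(&word, name, 8);         /* alignment-safe load */
		if (word & HIGH_BITS)
			break;                  /* some byte has its high bit set */
		hash = mix(hash, word & ~CASE_BITS);
		name += 8;
		len -= 8;
	}

	/* Slow path: the tail, plus everything from the first word that
	 * contained a non-ASCII byte, handled one byte at a time. */
	while (len--) {
		unsigned char c = (unsigned char)*name++;

		if (!(c & 0x80))
			c &= 0xdf;              /* clear bit 5 for ASCII bytes */
		hash = mix(hash, c);
	}
	return hash;
}
```

With this sketch, icase_hash("README", 6) and icase_hash("readme", 6) return the same value, so both names land in the same bucket, and the exact case-insensitive compare only has to run on the few entries found there. The value depends on the byte order of the machine, so it is only suitable for an in-memory table. Non-ASCII bytes are hashed unchanged here, which sidesteps real multi-byte case folding entirely.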