On Thu, Mar 1, 2012 at 4:46 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote: > > There should be generally better modern general hash algorithms around, > like murmur, cityhash or snoopy. Perhaps even the fnv we have in tree, > but it's somewhat dated by know. > > They all have larger code, but if it's really that hot it would be worth > it. The quality of our hash function really doesn't seem to be the issue. In fact, my (other - I've done both cache-hot and cache-cold) profiles seem to indicate that the only access that ever matters is the first one (ie the read from the hash table, not the following of the chains). I have a dim memory of us playing with the hash function itself long long ago, and what made a difference was using the right bits from the "parent" pointer, not the hash of the name itself. That's especially true since many filenames are really just in the 3-8 character length thing, so creating a 32-bit hash from that is not going to have lots of collisions. We basically have "two hashes": we have the full 32-bit "hash number" that we compute over the filename (and also compare at lookup time before we do a name compare), and then we have the "bucket number" which is the actual hash chain number. The bucket number is generate from both the filename hash number and the parent pointer, and then we try to mix bits around with that GOLDEN_RATIO_PRIME thing. And last time this was an issue, it was the "mixing bits around" that made a difference to proper bucket use. See commit 99effef9544e: "[PATCH] dentry and inode cache hash algorithm performance changes." in the historical BK import by Thomas. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html