On 09/12/2014 04:39 PM, Linus Torvalds wrote:
On Fri, Sep 12, 2014 at 12:52 PM, Josef Bacik <jbacik@xxxxxx> wrote:
So that looks much better, not perfect but hlist_for_each through 5 entries
isn't going to kill us, I'll build a kernel with this and get back shortly
with real numbers.
So one thing to look out for is that the fold_hash() is only effective
for 64-bit architectures, and I'm pretty sure that we end up having a
very similar issue on 32-bit architectures that have
- two 32-bit words with the fairly simplisting mixing (basically
"first word * 9 + second word")
This is probably ok'ish, in that you end up still having a 32-bit
number with *most* of the bits being relevant, but we should think
about this too.
- the real problem is likely the "d_hash()" function that narrows the
32-bit hash down to d_hash_mask bits, and drops a lot of bits as it
does so (ie d_hash_mask is usually on the order of 12 bits, so it
really only uses 24 bits of the 32-bit hash value, and 12 bits is
particularly bad anyway, since it's a multiple of three which is what
the '9' in the first simplistic mixing used too.
So it *might* be better to fix this all at d_hash() instead, and make
that use "hash_u32(hash, d_hash_bits)" instead. And maybe keep the
fold_hash() stage fairly stupid.
Anyway, what I'm trying to say is that your numbers are certainly
*much* better than the hash bucket list used to be, but we should
probably tweak it a bit more. I'm not sure you really need to go to a
real hash function like murmur, but it would also be a good idea to at
least look at making some initial value be randomized so that people
also can't quite as easily perhaps try to hit worst-case situations
where they control the hash and can make really long buckets (the fact
that we mix in the parent pointer and many filesystems have some
directory size limit might make that harder anyway, but..)
Ok I have a good direction to take this in, so far the best results have
been to change fold_hash() to hash_64(hash, 32) and change d_hash to do this
static int *d_hash(int *table, unsigned long parent, unsigned int hash)
{
hash += (unsigned long) parent / L1_CACHE_BYTES;
hash += hash_32(hash, d_hash_shift);
return table + (hash & d_hash_mask);
}
which has gotten us 890k buckets. I'll futz with it some more on Monday
and send out a patch with whatever gives me the best numbers. Thanks
for the help, hashing is not my forte,
Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html