Re: Name hashing function causing a perf regression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/12/2014 06:01 PM, Linus Torvalds wrote:
On Fri, Sep 12, 2014 at 2:25 PM, Josef Bacik <jbacik@xxxxxx> wrote:

Ok I have a good direction to take this in, so far the best results have
been to change fold_hash() to hash_64(hash, 32) and change d_hash to do this

static int *d_hash(int *table, unsigned long parent, unsigned int hash)
{
         hash += (unsigned long) parent / L1_CACHE_BYTES;
         hash += hash_32(hash, d_hash_shift);
         return table + (hash & d_hash_mask);
}


It really should be sufficient to just do

   static int *d_hash(int *table, unsigned long parent, unsigned int hash)
   {
         hash += (unsigned long) parent / L1_CACHE_BYTES;
         return table + hash_32(hash, d_hash_shift);
   }

because the hash_32() function should already reduce the hash to its
second argument (d_hash_shift) and mix around the bits sufficiently.

I'm a *bit* nervous about the cost of this all, especially on CPU's
where integer multiplies are expensive, but obviously we need to
improve on the final hashing.

Just out of interest, how good/bad does the hash look if the *only*
change you do is the above d_hash() thing (ie just leave the
fold_hash() thing alone?)




[jbacik@devbig005 ~/local] ./hash
Old hash table had 1000000 entries, 0 dupes, 0 max dupes
New hash table had 62200 entries, 937800 dupes, 90 max dupes
Fnv hash table had 988406 entries, 11594 dupes, 3 max dupes
We had 49800 buckets with a p50 of 8 dupes, p90 of 45 dupes, p99 of 80 dupes for the new hash

"New hash" is the number you want. Not as bad as originally, not as good as replacing fold_hash with hash_64 and not changing d_hash. Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux