Re: XDL_FAST_HASH can be very slow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have been working with Peff on this and have more results to share.

For background, xdl_hash_record is a hashing function, producing an
unsigned long from an input string terminated by either a newline or
the end of the mmap'd file.

The original xdl_hash_record is essentially DJB hash, which does a
multiplication, load, and xor for each byte of the input.  Commit
6942efc introduces an "XDL_FAST_HASH" version of the same function
that is clearly inspired by the DJB hash, but it does only one
multiplication, load, and xor for each 8 bytes of input -- i.e., fewer
loads, but also a lot less bit mixing.  Less mixing means far more
collisions, leading to the performance problems with evil-icons.  It's
not clear to me if the XDL_FAST_HASH version intended to match the
output of the DJB hash function, but it doesn't at all.

Peff has been experimenting with using two modern hash functions, FNV
and Murmur3.  In theory, these should produce fewer collisions than
DJB, but in his measurements, they didn't run diff any faster than
plain DJB.  They do fix the evil-icons problem.

I have implemented two simpler possibilities, both of which fix the
problems diffing the evil-icons repository:

1. An XDL_FAST_HASH implementation that matches the output of the DJB
hash exactly.  Its performance is basically the same as DJB, because
the only thing is does differently is load 8 bytes at a time instead
of 1.  It does all the same ALU operations as DJB.

2. Using (hash % prime_number) instead of (hash & ((1<<hbits)-1)) to
map hash values to buckets in the hash table.  This helps because
there's entropy in the high bits of the hash values that's lost
completely if we just mask off the low hbits bits.  I've chosen prime
numbers that are close to the power-of-two sizes of the table -- e.g.,
32749 instead of 32768 -- so very little space is wasted.  Applying
this change to the XDL_FAST_HASH hash function makes it perform as
well as DJB and Murmur3.  That is, it eliminates the performance
problems with the evil-icons repo.

I evaluated several of the hash functions according to how deep the
chains are in each hash bucket, when diffing the evil-icons repo.
DJB, Murmur3, and XDL_FAST_HASH%prime all produce near-optimal
scattering, with the longest chain between 29 and 34 elements long.
XDL_FAST_HASH as implemented in the current git tree -- with
bit-masking instead of modulo-prime -- produces 100 buckets with chain
lengths over 4000.  Most of the other buckets are empty.  Each of
these long chains takes quadratic time to build and linear time to
traverse, which presumably is where the terrible performance for
evil-icons comes from.

The bottom line is, I think XDL_FAST_HASH needs to go, because it has
poorly understood collision behavior with pretty bad worst cases.  I
don't have strong feelings about what should replace it -- original
DJB, a fixed XDL_FAST_HASH, Murmur3, or something else.  All of the
replacements have good collision behavior and good behavior on the
repos I've tested, but appear to be a few percent slower in the common
case.

--Patrick
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]