Re: I'm a total push-over..

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/23/08, Andreas Ericsson <ae@xxxxxx> wrote:
> Marko Kreen wrote:
> > (not lookup3) because lookup3 beat easily all the "simple" hashes
>
> By how much? FNV beat Linus' hash by 0.01 microseconds / insertion,
> and 0.1 microsecons / lookup. We're talking about a case here where
> there will never be more lookups than insertions (unless I'm much
> mistaken).

FNV is around 40% slower than lookup3 on my Intel Core CPU, on 4byte aligned
input. See below for more detailed info.

> > If you don't mind few percent speed penalty compared to Jenkings
> > own optimized version, you can use my simplified version:
> >
> >   http://repo.or.cz/w/pgbouncer.git?a=blob;f=src/hash.c;h=5c9a73639ad098c296c0be562c34573189f3e083;hb=HEAD
>
> I don't, but I don't care that deeply either. On the one hand,
> it would be nifty to have an excellent hash-function in git.
> On the other hand, it would look stupid with something that's
> quite clearly over-kill.

Jenkins hash is fast because it does not look at individual bytes.
If you _do_ want to look at them for unrelated reasons, (case-insensitive,
unicode-juggling), then it obiously loses the point.  That is, if you
want to process the string in one go.

> > It works always with "native" endianess, unlike Jenkins fixed-endian
> > hashlittle() / hashbig().  It may or may not matter if you plan
> > to write values on disk.
> >
> > Speed-wise it may be 10-30% slower worst case (in my case sparc-classic
> > with unaligned data), but on x86, lucky gcc version and maybe
> > also memcpy() hack seen in system.h, it tends to be ~10% faster,
> > especially as it does always 4byte read in main loop.
>
> It would have to be a significant improvement in wall-clock time
> on a test-case of hashing 30k strings to warrant going from 6 to 80
> lines of code, imo. I still believe the original dumb hash Linus
> wrote is "good enough".

Well, ad-hoc dumb hashes may have horrible worst-cases that you cant
see with light testing.  Therefore I'd still suggest some better
researched dumb hash (eg. FNV or OAT).

> On a side-note, it was very interesting reading, and I shall have
> to add jenkins3_mkreen() to my test-suite (although the "keep
> copyright note" license thing bugs me a bit).

Sorry.  I just used template boilerplate.  Considering all the
hard work was done by other people, it not proper to put under
my own license.  I tagged the file as 'public domain' and pushed out.

Btw, the reason I started cleaning lookup3 was that at first I was
scared of the complexity of Jenkins code and decided to go with
Hsieh hash.  Then I found out that Hsieh code is under totally
werdo license (http://www.azillionmonkeys.com/qed/weblicense.html)
so I could not use it.

====================================================================

Here is my raw-speed test of different hashes.  Input is 4-byte
aligned which should be common case for malloc()-ed strings.
This also is best case for original lookup3(), on unaligned
input the memcpy variants beat it easily.  Input string
length varies randomly in range 0..64.

own_memcpy - last 12-byte memcpy() calls out to libc
memcpy_hack - last memcpy is inlined bytewise copy loop:

  while (len--) *dst++ = *src++;

Note that is is raw-speed test, if you benchmark larger code the
hash difference probably matters less.

--------------------------------------------------------------------

Testing: seed=34 align=4 minlen=0 maxlen=64 trycnt=2 duration=10

lookup3 : try=0: ... 247.4880 MB/s
lookup3 : try=1: ... 247.6154 MB/s
own_memcpy: try=0: ... 223.5508 MB/s
own_memcpy: try=1: ... 223.5830 MB/s
memcpy_hack: try=0: ... 241.2241 MB/s
memcpy_hack: try=1: ... 241.2492 MB/s
lookup2 : try=0: ... 190.2697 MB/s
lookup2 : try=1: ... 190.3283 MB/s
fnv     : try=0: ... 153.0318 MB/s
fnv     : try=1: ... 153.0178 MB/s
hsieh   : try=0: ... 234.0468 MB/s
hsieh   : try=1: ... 234.0426 MB/s
oat     : try=0: ... 154.7804 MB/s
oat     : try=1: ... 154.8226 MB/s
elf     : try=0: ... 125.5892 MB/s
elf     : try=1: ... 125.5734 MB/s

Results compared to reference:

lookup3         : 100.000 %
own_memcpy      :  90.311 %
memcpy_hack     :  97.449 %
lookup2         :  76.872 %
fnv             :  61.815 %
hsieh           :  94.544 %
oat             :  62.533 %
elf             :  50.729 %


-- 
marko
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux