Vicent Marti <tanoku@xxxxxxxxx> writes: > This commit brings `khash.h`, a header only hash table implementation > that while remaining rather simple (uses quadratic probing and a > standard hashing scheme) and self-contained, offers a significant > performance improvement in both insertion and lookup times. > > `khash` is a generic hash table implementation that can be 'templated' > for any given type while maintaining good performance by using preprocessor > macros. This specific version has been modified to define by default a > `khash_sha1` type, a map of SHA1s (const unsigned char[20]) to void * > pointers. > > When replacing the old hash table implementation in `pack-objects` with > the khash_sha1 table, the insertion time is greatly reduced: > > kh_put_sha1 :: 284.011ms > add_object_entry_1 : 36.06ms > hashcmp :: 24.045ms > > This reduction of more than 50% in the insertion and lookup times, > although nice, is not particularly noticeable for normal `pack-objects` > operation: `pack-objects` performs massive batch insertions and > relatively few lookups, so `khash` doesn't get a chance to shine here. > > The big win here, however, is in the massively reduced amount of hash > collisions (as you can see from the huge reduction of time spent in > `hashcmp` after the change). These greatly improved lookup times > will result critical once we implement the writing algorithm for bitmap > indxes in a later patch of this series. Is that reduction in collisions purely because it uses quadratic probing, or is there some other magic trick involved? Is the same also applicable to the other users of the "big" object hash table? (I assume Peff has already tried applying it there, but I'm still curious...) -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html