On Tue, Jun 25, 2013 at 04:03:22PM +0200, Thomas Rast wrote: > > The big win here, however, is in the massively reduced amount of hash > > collisions (as you can see from the huge reduction of time spent in > > `hashcmp` after the change). These greatly improved lookup times > > will result critical once we implement the writing algorithm for bitmap > > indxes in a later patch of this series. > > Is that reduction in collisions purely because it uses quadratic > probing, or is there some other magic trick involved? Is the same also > applicable to the other users of the "big" object hash table? (I assume > Peff has already tried applying it there, but I'm still curious...) I haven't done any actual timings yet. The general code is quite similar to our object.c hash table, with the exception that it does quadratic probing. I did try quadratic probing on our object.c hash once and didn't see much improvement (similarly, Junio tried cuckoo hashing, but the numbers were not that exciting). It's possible that the hash table in pack-objects did not behave as well as the one in object.c. It looks like we grow it when the table is 3/4 full, which is a little high (we grow at 1/2 in object.c). Quadratic probing should help when the hash table is close to full, so it would probably help. However, I also note that khash keeps its hash tables only half full, so that may be the real source of the performance improvement. So I suspect two things (but as I said, haven't verified): 1. You could speed up pack-objects just by keeping the table half full rather than 3/4 full. 2. You would see little to no speedup by moving object.c to khash, as it is adding only quadratic probing. With quadratic probing, you could potentially tweak the kh_put_* to resize less aggressively (say, 2/3) and save some memory without loss of performance. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html