On Wed, Sep 14, 2016 at 07:01:41PM -0700, Stefan Beller wrote: > According to Jeff, sending patches that don't get accepted is the new hype! It is what all the cool kids are doing. Unfortunately, it does not save you from nitpicky reviews...;) > first = i = hash_obj(sha1, obj_hash_size); > + clock_gettime(CLOCK_MONOTONIC, &time1); > while ((obj = obj_hash[i]) != NULL) { > if (!hashcmp(sha1, obj->oid.hash)) > break; > @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1) > if (i == obj_hash_size) > i = 0; > } > + clock_gettime(CLOCK_MONOTONIC, &time2); > + diff(&time1, &time2, &t_diff); > + add_time_to(&caching, &t_diff); > if (obj && i != first) { I don't think this is actually measuring the time spent on collisions. It's measuring the time we spend in hashcmp(), but that includes the non-collision case where we find it in the first hashcmp. Measuring _just_ the collisions is more like the patch below. In my measurements it's more like 30ms, compared to 10s for all of the hashcmps. So we really aren't dealing with collisions, but rather just verifying that our hash landed at the right spot. And _any_ data structure is going to have to do that. If you want to make it faster, I'd try optimizing hashcmp (and you can see why the critbit tree was slower; if we spend so much time just on hashcmp() to make sure we're at the right key, then making that slower with a bunch of branching is not going to help). I notice we still open-code hashcmp. I get a slight speedup by switching it to memcmp(). About 2.5%, which is similar to what I showed in http://public-inbox.org/git/20130318073229.GA5551@xxxxxxxxxxxxxxxxxxxxx/ a few years ago (though it's more pronounced as a portion of the whole now, because we've optimized some of the other bits). The main driver there was memcmp() improvements that went into glibc 2.13 several years ago. It might be time to start assuming that memcmp() beats our open-coded loop. It may also be possible to really micro-optimize it on some platforms, because we know the size in advance (I'd kind of expect the compiler to do that, but if we're ending up in glibc memcmp then it sounds like it is not the case). -Peff