Jeff King wrote: > Here's the patch. For some reason my numbers aren't quite as large as > they were yesterday (I was very careful to keep the system unloaded > today, whereas yesterday I was doing a few other things, so perhaps that > is the difference). > > -- >8 -- > Subject: [PATCH] hashcmp: assert constant hash size > > Prior to 509f6f62a4 (cache: update object ID functions for > the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a > constant size of 20 bytes. Some compilers were able to turn > that into a few quad-word comparisons, which is faster than > actually calling memcmp(). > > In 509f6f62a4, we started using the_hash_algo->rawsz > instead. Even though this will always be 20, the compiler > doesn't know that while inlining hashcmp() and ends up just > generating a call to memcmp(). > > Eventually we'll have to deal with multiple hash sizes, but > for the upcoming v2.19, we can restore some of the original > performance by asserting on the size. That gives the > compiler enough information to know that the memcmp will > always be called with a length of 20, and it performs the > same optimization. > > Here are numbers for p0001.2 run against linux.git on a few > versions. This is using -O2 with gcc 8.2.0. > > Test v2.18.0 v2.19.0-rc0 HEAD > ------------------------------------------------------------------------------ > 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0% > > You can see that v2.19 is a little slower than v2.18. This > commit ended up slightly faster than v2.18, but there's a > fair bit of run-to-run noise (the generated code in the two > cases is basically the same). This patch does seem to be > consistently 1-2% faster than v2.19. > > I tried changing hashcpy(), which was also touched by > 509f6f62a4, in the same way, but couldn't measure any > speedup. Which makes sense, at least for this workload. A > traversal of the whole commit graph requires looking up > every entry of every tree via lookup_object(). That's many > multiples of the numbers of objects in the repository (most > of the lookups just return "yes, we already saw that > object"). > > Reported-by: Derrick Stolee <stolee@xxxxxxxxx> > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > cache.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) Reviewed-by: Jonathan Nieder <jrnieder@xxxxxxxxx> Verified using "make object.s" that the memcmp call goes away. Thank you.