Re: [PATCH 2/2] hashcmp: use memcmp instead of open-coded loop

René Scharfe <l.s.r@xxxxxx> · Wed, 9 Aug 2017 16:55:43 +0200

On 09.08.2017 at 12:16, Jeff King wrote:
> In 1a812f3a70 (hashcmp(): inline memcmp() by hand to
> optimize, 2011-04-28), it was reported that an open-coded
> loop outperformed memcmp() for comparing sha1s.
> 
> Discussion[1] a few years later in 2013 showed that this
> depends on your libc's version of memcmp(). In particular,
> glibc 2.13 optimized their memcmp around 2011. Here are
> current timings with glibc 2.24 (best-of-five, on
> linux.git):
> 
>    [before this patch, open-coded]
>    $ time git rev-list --objects --all
>    real	0m35.357s
>    user	0m35.016s
>    sys	0m0.340s
> 
>    [after this patch, memcmp]
>    real	0m32.930s
>    user	0m32.630s
>    sys	0m0.300s

Nice.  And here's the size of the git executable in my build:

            unstripped  stripped
  before       8048176   2082416
  after        8006064   2037360

> I also wondered if using memcmp() could be a hint to the compiler to use
> an intrinsic or some other trick, especially because the "len" here is a
> constant. But in a toy function compiled with "gcc -S", it looks like we
> do keep the call to memcmp (so the speedup really is glibc, and not some
> compiler magic).

GCC 7 inlines memcmp() if we only need a binary result, i.e. whether
the two buffers are equal rather than how they are ordered:

	https://godbolt.org/g/iZ11Ne
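
For illustration, here is a minimal sketch of the three variants under
discussion.  The names hashcmp_loop, hashcmp_memcmp, hash_equal and
HASH_LEN are made up for this example and are not taken from git's
source; the loop is only written in the style of the open-coded version
that the patch removes.

```c
#include <string.h>

#define HASH_LEN 20 /* assumed: raw SHA-1 digest size in bytes */

/* Open-coded byte-by-byte comparison (hypothetical name). */
static int hashcmp_loop(const unsigned char *a, const unsigned char *b)
{
	int i;

	for (i = 0; i < HASH_LEN; i++, a++, b++) {
		if (*a != *b)
			return *a - *b; /* sign gives the ordering */
	}
	return 0;
}

/* Same contract, delegated to the libc memcmp() (hypothetical name). */
static int hashcmp_memcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, HASH_LEN);
}

/*
 * Only equality is needed here, so the caller consumes a binary
 * result; this is the shape of use referred to above (hypothetical
 * helper, not an existing git function).
 */
static int hash_equal(const unsigned char *a, const unsigned char *b)
{
	return !memcmp(a, b, HASH_LEN);
}
```

Compiling something like this with "gcc -S" at different optimization
levels and compiler versions should show whether the memcmp() call
survives in each variant; I have not included any output here since it
depends on the toolchain.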

René