Re: [PATCH] git gc: Speed it up by 18% via faster hash comparisons

Dmitry Potapov <dpotapov@xxxxxxxxx> · Thu, 28 Apr 2011 13:31:53 +0400

2011/4/28 Ingo Molnar <mingo@xxxxxxx>:
>
> * Dmitry Potapov <dpotapov@xxxxxxxxx> wrote:
>
>> 2011/4/28 Ingo Molnar <mingo@xxxxxxx>:
>> > +static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
>> >  {
>> > -       return !memcmp(sha1, null_sha1, 20);
>> > +       int i;
>> > +
>> > +       for (i = 0; i < 20; i++, sha1++, sha2++) {
>> > +               if (*sha1 != *sha2) {
>>
>> At the very least, you may want to put 'likely' in this 'if'
>> condition, otherwise the compiler may optimize this loop in
>> the same way as with memcmp. So, it may work well now, but
>> it may not work much slower with future versions or different
>> level of optimization. (AFAIK, -O3 is far more aggressive in
>> optimizing of loops).
>
> the main difference is between the string assembly instructions and the loop.
> Modern CPUs will hardly notice this loop being emitted with slight variations
> by the compiler. So i do not share this concern.

Here you make an assumption what kind of optimization the compiler
can do. As Jonathan noticed above, theoretically a smart compiler
can turn this loop into memcmp (or code very similar to memcmp).

The reason why memcmp does not work well is that it is optimized
for the worst case scenario (where beginning of two strings is
the same), while _we_ know that with a hash it very unlikely,
and we want to conduct this knowledge to the compiler in some
way. Just re-writing memcmp as explicit loop does not conduct
this knowledge.

Therefore, I believe it makes sense to add 'likely'. I have not
tested this code, but in the past, I had a very similar code
which was compiled with -O3, and just putting likely turned out
to 40% speed-up for that comparison function.

Dmitry
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html