Artur Skawina wrote:
> Linus Torvalds wrote:
>> magic? I force the stores to the 512-bit hash bucket to be done in order.
>> That seems to help a lot.
>
> I named it 'linusv':
> linusv          0.3697      165.1

I was not going to spend even more time on the C version, but after
looking at what gcc does to it, tried this:

diff --git a/block-sha1/sha1vol.c b/block-sha1/sha1vol.c
--- a/block-sha1/sha1vol.c
+++ b/block-sha1/sha1vol.c
@@ -93,7 +93,7 @@ void blk_SHA1_Finalv(unsigned char hashout[20], blk_SHA_CTX *ctx)
 
 /* This "rolls" over the 512-bit array */
 #define W(x) (array[(x)&15])
-#define setW(x, val) (*(volatile unsigned int *)&W(x) = (val))
+#define setW(x, val) W(x) = (val); __asm__ volatile ("": "+m" (W(x)))
 
 /*
  * Where do we get the source from? The first 16 iterations get it from

and got a nice improvement:

rfc3174         1.436       42.49
linus           0.5843      104.5
linusph         0.5639      108.2
linusv          0.3098      197
linusvph        0.3082      198.1
linusasm        0.5849      104.3
linusp4         0.433       141
linusas         0.4077      149.7
linusas2        0.436       140
mozilla         1.099       55.54
mozillaas       1.295       47.11
openssl         0.2632      231.9
opensslb        0.2395      254.8
spelvin         0.2687      227.2
spelvina        0.2526      241.7
nettle          0.4378      139.4
nettle-ror      0.4379      139.4
nettle-p4sch    0.4231      144.2

The atom numbers didn't change much.

artur
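
For readers who want to try the two setW() variants from the patch outside of the git tree, here is a minimal standalone sketch. It is not the block-sha1 code: the function names, the DEFINE_EXPAND helper and the running checksum are made up for illustration, and the do/while wrapper around the asm variant is an addition of this sketch (the patch uses two bare statements). It assumes GCC or Clang, since it relies on their extended asm syntax. As far as I understand it, the empty asm with a "+m" operand tells the compiler the slot may be read and modified in memory at that point, so the store has to be emitted there, without making the access itself volatile.

```c
#include <stdint.h>
#include <string.h>

/* This "rolls" over the 512-bit (16-word) array, as in the patch. */
#define W(x) (array[(x) & 15])

/* Baseline: plain store, the compiler may schedule it freely. */
#define setW_plain(x, val) (W(x) = (val))

/* Variant removed by the patch: store through a volatile pointer. */
#define setW_volatile(x, val) (*(volatile uint32_t *)&W(x) = (val))

/*
 * Variant added by the patch: plain store followed by an empty asm that
 * names the slot as a read/write memory operand (GCC/Clang extension).
 * The do/while wrapper is this sketch's addition, not part of the patch.
 */
#define setW_asm(x, val) \
	do { W(x) = (val); __asm__ volatile ("" : "+m" (W(x))); } while (0)

static inline uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * Hypothetical helper: defines a function that runs the SHA-1 message
 * schedule expansion (words 16..79) over the rolling array using the
 * given store macro, and returns a simple running checksum of the
 * generated words so the variants can be compared.
 */
#define DEFINE_EXPAND(name, SETW)                                  \
uint32_t name(const uint32_t block[16])                            \
{                                                                  \
	uint32_t array[16];                                        \
	uint32_t acc = 0;                                          \
	int t;                                                     \
	memcpy(array, block, sizeof(array));                       \
	for (t = 16; t < 80; t++) {                                \
		uint32_t w = rol32(W(t - 3) ^ W(t - 8) ^           \
				   W(t - 14) ^ W(t - 16), 1);      \
		SETW(t, w);                                        \
		acc = rol32(acc, 5) + w;                           \
	}                                                          \
	return acc;                                                \
}

DEFINE_EXPAND(expand_plain, setW_plain)
DEFINE_EXPAND(expand_volatile, setW_volatile)
DEFINE_EXPAND(expand_asm, setW_asm)
```

All three functions must return the same value for the same input block; only the code the compiler generates for the stores should differ. Comparing the output of "gcc -O2 -S" for the three is probably the quickest way to see what changes, and whether any of it matters for speed will depend on the compiler version and the CPU.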