Giuseppe Scrivano wrote:
> Hi Pádraig,
>
> I tried to reproduce your results but I wasn't able to do it. The
> biggest difference on a 300MB file I noticed was approximately 15% using
> on both implementations -O2, and 5% using -O3.
> My GCC version is "gcc (Debian 4.3.3-14) 4.3.3" and the CPU is: Intel(R)
> Pentium(R) D CPU 3.20GHz.
>
> I also spent some time trying to improve the gnulib SHA1 implementation
> and it seems a lookup table can improve things a bit.
>
> Can you please try the patch that I have attached and tell me which
> performance difference (if any) you get?

Thanks for looking at this, Giuseppe, and sorry for not mentioning
my GCC and CPU. Note that the binaries below are compiled with
$(rpm -q --qf="%{OPTFLAGS}\n" coreutils) for consistency,
which on my F11 machines is:

  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
  -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i586
  -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE=1

Testing on 2 machines I have here:

$ rpm -q gcc
gcc-4.4.1-2.fc11.i586
$ grep "model name" /proc/cpuinfo | head -n1 | tr -s '[:blank:]' ' '
model name : Intel(R) Pentium(R) M processor 1.70GHz
$ truncate -s300MB sha1.test
$ time sha1sum sha1.test
real 0m3.540s
$ time linus-sha1 sha1.test
real 0m2.319s (-34%)
$ time giuseppe-sha1sum sha1.test
real 0m3.513s (-.8%)

$ rpm -q gcc
gcc-4.4.1-2.fc11.i586
$ grep "model name" /proc/cpuinfo | head -n1 | tr -s '[:blank:]' ' '
model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
$ truncate -s300MB sha1.test
$ time sha1sum sha1.test
real 0m1.857s
$ time linus-sha1 sha1.test
real 0m1.102s (-40%)
$ time giuseppe-sha1sum sha1.test
real 0m1.932s (+ 4%)

cheers,
Pádraig.
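
For reference, the percentages in parentheses next to the timings read as the relative change in wall-clock time against the stock sha1sum run on the same machine. A minimal sketch of that arithmetic follows, assuming a POSIX awk is available; pct_change is a hypothetical helper name used only for illustration, not something from coreutils or the thread.

```shell
# pct_change BASE NEW
# Print the relative change of NEW against BASE, in percent, with one decimal.
# (Hypothetical helper for illustration; not part of any package.)
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f\n", (new - base) / base * 100 }'
}

# Pentium M timings from the message, in seconds:
pct_change 3.540 2.319   # sha1sum vs linus-sha1       -> -34.5
pct_change 3.540 3.513   # sha1sum vs giuseppe-sha1sum -> -0.8
```

A negative value means the alternative implementation finished faster than the baseline, a positive one that it was slower.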