Re: [PATCH 0/7] block-sha1: improved SHA1 hashing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Basically, older P4's will I think shift one bit at a time. So while even 
> Prescott is relatively weak in the shifter department, pre-prescott 
> (Willamette and Northwood) are _really_ weak. If your P4 is one of those, 
> you really shouldn't use it to decide on optimizations.

Thanks for the warning.  The only P4 I have a login on is a Willamette, so
I guess it's a bad optimization target:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 1
model name      : Intel(R) Pentium(R) 4 CPU 1.60GHz
stepping        : 2
cpu MHz         : 1593.888
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pebs bts
bogomips        : 3187.77
clflush size    : 64
power management:



BTW, I'm having a bit of a problem with sha1bench.  Because the number
of rounds is multiplied by 10 until it takes more than mintime, and
the watchdog timer is set to 12*mintime, if I just barely miss the
threshold, I can hit the watchdog on any code that's more than 20% slower
than the reference rfc3174 code:

# /tmp/sha1bench 
#Initializing... Rounds: 10000000, size: 625000K, time: 6.904s, speed: 88.41MB/s
#             TIME[s] SPEED[MB/s]
rfc3174         6.912       88.31
# New hash result: 0489b02aee9fbd82b0bb0cba96f8047e42f543b8
rfc3174         6.911       88.31
linus           3.002       203.3
linusas         3.253       187.7
linusas2        3.059       199.6
mozilla         10.86       56.22
mozillaas       10.33       59.09
openssl         2.145       284.5
spelvin         1.933       315.8
spelvina        1.933       315.8
nettle          2.161       282.4

I had to bump it to 20 times to be able to get past the mozilla code.

You can still have nice round numbers with smaller incements of 2x or 2.5x:

    do {
-      rounds *= 10;
+      rounds *= 2;
+      if (rounds % 9 == 4)
+        rounds += rounds/4;     /* Step 1, 2, 5, 10, 20, 50, 100... */
       uWATCHDOG(mintime*20);
       t1 = GETTIME();
       for (i=0; i<rounds; i++) {
          SHA1Input(sc, arr512b, sizeof(arr512b));
          if (i<64) {
             SHA1Result(sc, arr512b+
                     (i+arr512b[i]+arr512b[arr512b[i]%64]+
                      arr512b[63-i]+arr512b[arr512b[63-i]%64]) %
                     (sizeof(arr512b)-20));
             SHA1Reset(sc);
             }
          }
       t2 = GETTIME(); td = t2-t1;
       }
    while (td<mintime);

And yeah, my P4 is touchy:
n# /var/tmp/sha1bench 
#Initializing... Rounds: 500000, size: 31250K, time: 0.9736s, speed: 31.35MB/s
#             TIME[s] SPEED[MB/s]
rfc3174        0.9931       30.73
# New hash result: 2872616106e163ae9c7c8d12a38bef032323c844
rfc3174        0.9491       32.16
linus          0.4906        62.2
linusas        0.5799       52.62
linusas2       0.4859       62.81
mozilla         1.302       23.44
mozillaas       1.234       24.74
openssl         0.226         135
spelvin        0.2298       132.8
spelvina       0.2494       122.4
nettle         0.3687       82.78
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]