> - Dan timed the sha1dc implementation with and without the collision
>   detection enabled. The sha1 implementation is only 1.33x slower than
>   block-sha1 (for raw sha1 time). Adding in the detection makes it
>   2.6x slower.
>   So there's some potential gain from optimizing the sha1
>   implementation, but ultimately we may be looking at a 2x slowdown to
>   add in the collision detection.

I rearranged our code a little bit and interleaved the message
expansion and rounds. This brings our raw SHA-1 implementation (without
collision detection) down to 1.11x slower than the block-sha1
implementation in Git. Adding the collision detection brings us to
2.12x slower than the block-sha1 implementation.

This was basically picking the low-hanging fruit in optimizing our
implementation. There are some things that I haven't looked into yet,
but I'm basically at the point of comparing the generated assembler to
see what's different between our implementations.

OpenSSL's SHA-1 is implemented in assembler, so there's no way we're
going to get close to that with just C-level coding.

Thanks,
Dan
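
For anyone unfamiliar with the idea, here is a minimal sketch of what
interleaving the message expansion with the rounds looks like. This is
not the sha1dc or block-sha1 code; the names (sha1_compress_sketch,
sha1_short_sketch, EXPAND, ROL32) are made up for illustration. Rather
than filling an 80-word schedule up front, each expanded word is
computed in a 16-word circular buffer right before the round that
consumes it. The second function only handles messages shorter than 56
bytes, so the sketch can be checked against a known digest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/*
 * W[t] = ROL(W[t-3] ^ W[t-8] ^ W[t-14] ^ W[t-16], 1), for t >= 16.
 * With a 16-word circular buffer, W[t-16] lives in the slot that
 * W[t] is about to overwrite.
 */
#define EXPAND(w, t) \
	((w)[(t) & 15] = ROL32((w)[((t) - 3) & 15] ^ (w)[((t) - 8) & 15] ^ \
			       (w)[((t) - 14) & 15] ^ (w)[(t) & 15], 1))

/* Hypothetical name: process one 64-byte block, updating h[] in place. */
static void sha1_compress_sketch(uint32_t h[5], const unsigned char block[64])
{
	uint32_t w[16];
	uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
	int t;

	for (t = 0; t < 80; t++) {
		uint32_t f, k, wt, tmp;

		/* Load or expand the word for this round only. */
		if (t < 16)
			wt = w[t] = ((uint32_t)block[4 * t] << 24) |
				    ((uint32_t)block[4 * t + 1] << 16) |
				    ((uint32_t)block[4 * t + 2] << 8) |
				    (uint32_t)block[4 * t + 3];
		else
			wt = EXPAND(w, t);

		if (t < 20) {
			f = (b & c) | (~b & d);
			k = 0x5a827999;
		} else if (t < 40) {
			f = b ^ c ^ d;
			k = 0x6ed9eba1;
		} else if (t < 60) {
			f = (b & c) | (b & d) | (c & d);
			k = 0x8f1bbcdc;
		} else {
			f = b ^ c ^ d;
			k = 0xca62c1d6;
		}

		tmp = ROL32(a, 5) + f + e + k + wt;
		e = d;
		d = c;
		c = ROL32(b, 30);
		b = a;
		a = tmp;
	}

	h[0] += a;
	h[1] += b;
	h[2] += c;
	h[3] += d;
	h[4] += e;
}

/* Hypothetical helper: hash a message that fits in one padded block. */
static void sha1_short_sketch(const unsigned char *msg, size_t len,
			      uint32_t h[5])
{
	unsigned char block[64] = { 0 };

	assert(len < 56);

	h[0] = 0x67452301;
	h[1] = 0xefcdab89;
	h[2] = 0x98badcfe;
	h[3] = 0x10325476;
	h[4] = 0xc3d2e1f0;

	memcpy(block, msg, len);
	block[len] = 0x80;
	/* Big-endian bit length; len < 56, so two bytes are enough. */
	block[62] = (unsigned char)((len * 8) >> 8);
	block[63] = (unsigned char)(len * 8);

	sha1_compress_sketch(h, block);
}
```

The loop form keeps the sketch short. An optimized implementation would
normally unroll the rounds so the per-round branches disappear, which
this sketch does not attempt.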