Re: x86 SHA1: Faster than OpenSSL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 5 Aug 2009, Junio C Hamano wrote:

> Nicolas Pitre <nico@xxxxxxx> writes:
> 
> > On Wed, 5 Aug 2009, Linus Torvalds wrote:
> >
> >> But while looking at 32-bit issues, I noticed that I really should also 
> >> cast 'len' when shifting it. Otherwise the thing is limited to fairly 
> >> small areas (28 bits - 256MB). This is not just a 32-bit problem ("int" is 
> >> a signed 32-bit thing even in a 64-bit build), but I only noticed it when 
> >> looking at 32-bit issues.
> >
> > Even better is to not shift len at all in SHA_update() but shift 
> > ctx->size only at the end in SHA_final().  It is not like if 
> > SHA_update() could operate on partial bytes, so counting total bytes 
> > instead of total bits is all you need.  This way you need no cast there 
> > and make the code slightly faster.
> 
> Like this?

Almost (see below).

> By the way, Mozilla one calls Init at the end of Final but block-sha1
> doesn't.  I do not think it matters for our callers, but on the other hand
> FInal is not performance critical part nor Init is heavy, so it may not be
> a bad idea to imitate them as well.  Or am I missing something?

It is done only to make sure potentially crypto sensitive information is 
wiped out of the ctx structure instance.  In our case we have no such 
concerns.

> diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
> index eef32f7..8293f7b 100644
> --- a/block-sha1/sha1.c
> +++ b/block-sha1/sha1.c
> @@ -31,7 +31,7 @@ void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
>  {
>  	int lenW = ctx->lenW;
>  
> -	ctx->size += (unsigned long long) len << 3;
> +	ctx->size += (unsigned long long) len;

You can get rid of the cast as well now.

>  	/* Read the data into W and process blocks as they get full
>  	 */
> @@ -68,6 +68,7 @@ void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
>  
>  	/* Pad with a binary 1 (ie 0x80), then zeroes, then length
>  	 */
> +	ctx->size <<= 3; /* bytes to bits */
>  	padlen[0] = htonl(ctx->size >> 32);
>  	padlen[1] = htonl(ctx->size);

Instead, I'd do:

	padlen[0] = htonl(ctx->size >> (32 - 3));
	padlen[1] = htonl(ctx->size << 3);

That would eliminate a redundant write back of ctx->size.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]