Re: RFC v3: Another proposed hash function transition plan

Jeff King <peff@xxxxxxxx> · Mon, 2 Oct 2017 15:37:31 -0400

On Mon, Oct 02, 2017 at 10:18:02AM -0700, Linus Torvalds wrote:

> On Mon, Oct 2, 2017 at 7:00 AM, Jason Cooper <jason@xxxxxxxxxxxxxx> wrote:
> >
> > Ahhh, so if I understand you correctly, you'd prefer SHA-256 over
> > SHA3-256 because it's more performant for your usecase?  Well, that's a
> > completely different animal that cryptographic suitability.
> 
> In almost all loads I've seen, zlib inflate() cost is a bigger deal
> than the crypto load. The crypto people talk about cycles per byte,
> but the deflate code is what usually takes the page faults and cache
> misses etc, and has bad branch prediction. That ends up easily being
> tens or thousands of cycles, even for small data.

If anyone is interested in the user-visible effects of slower crypto, I
think, there are some numbers in 8325e43b82 (Makefile: add DC_SHA1 knob,
2017-03-16). I don't know how SHA-256 compares to sha1dc exactly, but
certainly the latter is a lot slower than normal sha1.

The only real-world case I found with a noticeable slowdown was
index-pack.  Which in the worst case is roughly the same operation as
"git fsck" (inflate and compute the sha1 on every byte), but people tend
to actually do it a lot more often.

And it really _is_ slower for real-world operations; the CPU for
computing the sha1 of an incoming clone of linux.git jumped from ~3
minutes to ~6 minutes.  But I don't think we've seen a lot of
complaints, probably because that time is lumped in with "time to
transfer a gigabyte of data", so unless you're on a slow machine on fast
connection, you don't even really notice.

For day-to-day operations in a repository, I never came up with a good
example where the speed difference mattered. I think Dscho's giant-index
example is an outlier and the right answer there is not "pick a fast
crypto algorithm" but "stop using a slow crypto algorithm as a
checksum" (and also, stop routinely reading and writing 400MB for
day-to-day operations).

-Peff