Re: RFC v3: Another proposed hash function transition plan


 



Dear Johannes,

thanks for your response and for taking the effort to express your concerns. Please see below for some feedback.

On 30/09/17 00:33, Johannes Schindelin wrote:
Hi Joan,

On Fri, 29 Sep 2017, Joan Daemen wrote:

if ever there was a SHA-2 competition, it must have been held inside NSA:-)
Oops. My bad, I indeed got confused about that, as you suggest below (I
actually thought of the AES competition, but that was obviously not about
SHA-2). Sorry.

But maybe you are confusing it with the SHA-3 competition. In any case,
when considering SHA-2 vs SHA-3 for usage in git, you may have a look at
arguments we give in the following blogpost:

https://keccak.team/2017/open_source_crypto.html
Thanks for the pointer!

Small nit: the post uses "its" in place of "it's", twice.

Thanks, we'll correct that.

It does have a good point, of course: the scientific exchange (which you
call "open-source" in spirit) makes tons of sense.

As far as Git is concerned, we not only care about the source code of the
hash algorithm we use, we need to care even more about what you call
"executable": ready-to-use, high quality, well-tested implementations.

We carry source code for SHA-1 as part of Git's source code, which was
hand-tuned to be as fast as Linus could get it, which was tricky given
that the tuning should be general enough to apply to all common Intel
CPUs.

This hand-crafted code was blown out of the water by OpenSSL's SHA-1 in
our tests here at Microsoft, thanks to the fact that OpenSSL does
vectorized SHA-1 computation now.

To me, this illustrates why it is not good enough to have only a reference
implementation available at our fingertips. Of course, above-mentioned
OpenSSL supports SHA-256 and SHA3-256, too, and at least recent versions
vectorize those, too.
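
For readers who want to see how much the implementation matters on their own machine, here is a rough sketch of a throughput comparison. It is not the test run at Microsoft mentioned above: it assumes Python's hashlib, which on most builds delegates to OpenSSL, and the helper name throughput_mb_per_s and the buffer size are made up for illustration. The numbers depend entirely on the CPU and on how Python and OpenSSL were built.

```python
import hashlib
import time


def throughput_mb_per_s(algo: str, size: int = 1 << 20, repeats: int = 5) -> float:
    """Return the best observed hashing throughput in MB/s for one algorithm.

    Hypothetical helper for illustration: it times only the update() call
    on a zero-filled buffer and keeps the fastest of several runs.
    """
    data = bytes(size)
    best = float("inf")
    for _ in range(repeats):
        h = hashlib.new(algo)
        start = time.perf_counter()
        h.update(data)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse clock.
    return size / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha3_256", "shake_128"):
        print(f"{name:10s} {throughput_mb_per_s(name):8.1f} MB/s")
```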

There is a lot of high-quality optimized code for all SHA-3 functions and many CPUs in the Keccak code package (https://github.com/gvanas/KeccakCodePackage). OpenSSL also contains some good SHA-3 code, and then there are all the implementations related to Ethereum.

By the way, you speak about SHA3-256, but the right choice would be to use SHAKE128. Well, what exactly the right choice is depends on what you want. If you want to have a function in the SHA3 standard (FIPS 202), it is SHAKE128. You can boost performance on high-end CPUs by adopting ParallelHash from NIST SP 800-185, still a NIST standard. You can multiply that performance again by a factor of 2 by adopting KangarooTwelve. This is our (Keccak team) proposal for a parallelizable Keccak-based hash function that has a safety margin comparable to that of the SHA-2 functions. See https://keccak.team/kangarootwelve.html
May I also suggest you read https://keccak.team/2017/is_sha3_slow.html
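
As a small illustration of the SHA3-256 versus SHAKE128 distinction, the sketch below uses Python's hashlib (available since Python 3.6); the helper names are made up for this example. SHA3-256 has a fixed 32-byte output, while SHAKE128 is an extendable-output function where the caller chooses the length. In FIPS 202, SHAKE128 uses a 256-bit capacity and absorbs 168 bytes per permutation call, against a 512-bit capacity and 136 bytes for SHA3-256, which is where its speed advantage comes from.

```python
import hashlib


def sha3_256_hex(data: bytes) -> str:
    """Fixed-length SHA3-256 digest (32 bytes, 64 hex characters)."""
    return hashlib.sha3_256(data).hexdigest()


def shake128_hex(data: bytes, out_bytes: int = 32) -> str:
    """SHAKE128 output truncated to out_bytes; 32 bytes mimics a 256-bit hash."""
    return hashlib.shake_128(data).hexdigest(out_bytes)


if __name__ == "__main__":
    msg = b"hello world\n"
    print("SHA3-256 :", sha3_256_hex(msg))
    print("SHAKE128 :", shake128_hex(msg))
    # A shorter SHAKE128 output is a prefix of a longer one for the same input.
    print("prefix ok:", shake128_hex(msg, 64).startswith(shake128_hex(msg, 32)))
```

The two functions give unrelated outputs for the same input, because FIPS 202 appends different domain-separation bits before padding.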

Also, ARM processors have become a lot more popular, so we'll want to have
high-quality implementations of the hash algorithm also for those
processors.

Likewise, in contrast to 2005, nowadays implementations of Git in
languages as obscure as Javascript are not only theoretical but do exist in practice (https://github.com/creationix/js-git). I had a *very* quick
look for libraries providing crypto in Javascript and immediately found
the Stanford Javascript Crypto library
(https://github.com/bitwiseshiftleft/sjcl/) which seems to offer SHA-256
but not SHA3-256 computation.

Back to Intel processors: I read some vague hints about extensions
accelerating SHA-256 computation on future Intel processors, but not
SHA3-256.

It would make sense, of course, that more crypto libraries and more
hardware support would be available for SHA-256 than for SHA3-256 given
the time since publication: 16 vs 5 years (I am playing it loose here,
taking just the year into account, not the exact date, so please treat
that merely as a ballpark figure).

So from a practical point of view, I wonder what your take is on, say,
hardware support for SHA3-256. Do you think this will become a focus soon?

I think this is a chicken-and-egg problem. In any case, hardware support for SHA3-256 will also work for the other SHA3 and SHAKE functions, as they all use the same underlying primitive: the Keccak-f permutation. This is not the case for SHA2, because SHA224 and SHA256 use a different compression function than SHA384, SHA512, SHA512/224 and SHA512/256.
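
The point about the shared primitive can be sketched in code. The following is an unoptimized, illustration-only Python rendering of the FIPS 202 sponge, modeled on the compact reference style; the function names are chosen for this example. SHA3-256 and SHAKE128 call the same keccak_f1600 permutation and differ only in rate and domain-separation suffix, so accelerating that one function would benefit both.

```python
import hashlib

MASK64 = (1 << 64) - 1


def rol64(a: int, n: int) -> int:
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64


def keccak_f1600(state: bytearray) -> bytearray:
    """The single permutation shared by all SHA-3 and SHAKE functions."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1  # LFSR state that generates the round constants
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def sponge(rate_bytes: int, suffix: int, data: bytes, out_len: int) -> bytes:
    """Absorb data, pad with the given domain-separation suffix, then squeeze."""
    state = bytearray(200)
    offset = 0
    while len(data) - offset >= rate_bytes:
        for i in range(rate_bytes):
            state[i] ^= data[offset + i]
        state = keccak_f1600(state)
        offset += rate_bytes
    tail = data[offset:]
    for i, b in enumerate(tail):
        state[i] ^= b
    state[len(tail)] ^= suffix        # domain bits plus first padding bit
    state[rate_bytes - 1] ^= 0x80     # final padding bit
    state = keccak_f1600(state)
    out = bytearray()
    while len(out) < out_len:
        out += state[:rate_bytes]
        if len(out) < out_len:
            state = keccak_f1600(state)
    return bytes(out[:out_len])


def sha3_256(data: bytes) -> bytes:
    return sponge(136, 0x06, data, 32)


def shake128(data: bytes, out_len: int) -> bytes:
    return sponge(168, 0x1F, data, out_len)


if __name__ == "__main__":
    msg = b"The same permutation serves both functions."
    print(sha3_256(msg) == hashlib.sha3_256(msg).digest())
    print(shake128(msg, 32) == hashlib.shake_128(msg).digest(32))
```

This is far too slow for real use; it only shows that the two wrappers at the bottom are one-liners over the same core.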

Also, what is your take on the question whether SHA-256 is good enough?
SHA-1 was broken theoretically already 10 years after it was published
(which unfortunately did not prevent us from baking it into Git), after
all, while SHA-256 is 16 years old and the only known weakness does not
apply to Git's usage?

SHA-256 is more conservative than SHA-1 and I don't expect it to be broken in the coming decades (unless NSA inserted a backdoor but I don't think that is likely). But looking at the existing cryptanalysis, I think it is even less likely that SHAKE128, ParallelHash or KangarooTwelve will be broken anytime soon.

Also, while I have the attention of somebody who knows a heck more about cryptography than Git's top 10 committers combined: how soon do you expect
practical SHA-1 attacks that are much worse than what we already have
seen? I am concerned that if we do not move fast enough to a new hash
algorithm, and somebody finds a way in the meantime to craft arbitrary
messages given a prefix and an SHA-1, then we have a huge problem on
our hands.

This is hard to say. To be honest, when witnessing the first MD5 collisions I did not expect them to lead to real-world attacks, yet just a few years later we saw real-world forged certificates based on MD5 collisions. And SHA-1 has a lot in common with MD5...
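
To make concrete what an attacker would have to match: Git names a blob by hashing a short header ("blob", the content length, a NUL byte) followed by the content. The sketch below reproduces that with Python's hashlib. The function name git_blob_id is made up for this example, and the algo parameter is purely illustrative: passing anything other than "sha1" does not correspond to an object format Git used at the time of this thread.

```python
import hashlib


def git_blob_id(content: bytes, algo: str = "sha1") -> str:
    """Hash a blob the way Git does: header 'blob <len>\\0' plus content.

    With algo="sha1" this matches `git hash-object`; other algorithm
    names are a hypothetical illustration of swapping the hash function.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    data = b"hello world\n"
    print("sha1    :", git_blob_id(data))
    print("sha256  :", git_blob_id(data, "sha256"))
    print("sha3_256:", git_blob_id(data, "sha3_256"))
```

Because the length is part of the hashed header, a forged object has to collide on the header and content together, not on the file content alone.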

But let me end with a philosophical note. Independent of all the arguments for and against, I think this is ultimately about doing the right thing. The choice here is between SHA1/SHA2 on the one hand and SHA3/Keccak on the other. The former standards are imposed on us by NSA and the latter are the best that came out of an open competition involving all experts in the field worldwide. What would be closest to the philosophy of Git (and by extension Linux or open-source in general)?

Kind regards,

Joan


On 30/09/17 00:33, Johannes Schindelin wrote:
Hi Joan,

On Fri, 29 Sep 2017, Joan Daemen wrote:

if ever there was a SHA-2 competition, it must have been held inside NSA:-)
Oops. My bad, I indeed got confused about that, as you suggest below (I
actually thought of the AES competition, but that was obviously not about
SHA-2). Sorry.

But maybe you are confusing it with the SHA-3 competition. In any case,
when considering SHA-2 vs SHA-3 for usage in git, you may have a look at
arguments we give in the following blogpost:

https://keccak.team/2017/open_source_crypto.html
Thanks for the pointer!

Small nit: the post uses "its" in place of "it's", twice.

It does have a good point, of course: the scientific exchange (which you
call "open-source" in spirit) makes tons of sense.

As far as Git is concerned, we not only care about the source code of the
hash algorithm we use, we need to care even more about what you call
"executable": ready-to-use, high quality, well-tested implementations.

We carry source code for SHA-1 as part of Git's source code, which was
hand-tuned to be as fast as Linus could get it, which was tricky given
that the tuning should be general enough to apply to all common Intel
CPUs.

This hand-crafted code was blown out of the water by OpenSSL's SHA-1 in
our tests here at Microsoft, thanks to the fact that OpenSSL does
vectorized SHA-1 computation now.

To me, this illustrates why it is not good enough to have only a reference
implementation available at our fingertips. Of course, above-mentioned
OpenSSL supports SHA-256 and SHA3-256, too, and at least recent versions
vectorize those, too.

Also, ARM processors have become a lot more popular, so we'll want to have
high-quality implementations of the hash algorithm also for those
processors.

Likewise, in contrast to 2005, nowadays implementations of Git in
languages as obscure as Javascript are not only theoretical but do exist in practice (https://github.com/creationix/js-git). I had a *very* quick
look for libraries providing crypto in Javascript and immediately found
the Stanford Javascript Crypto library
(https://github.com/bitwiseshiftleft/sjcl/) which seems to offer SHA-256
but not SHA3-256 computation.

Back to Intel processors: I read some vague hints about extensions
accelerating SHA-256 computation on future Intel processors, but not
SHA3-256.

It would make sense, of course, that more crypto libraries and more
hardware support would be available for SHA-256 than for SHA3-256 given
the time since publication: 16 vs 5 years (I am playing it loose here,
taking just the year into account, not the exact date, so please treat
that merely as a ballpark figure).

So from a practical point of view, I wonder what your take is on, say,
hardware support for SHA3-256. Do you think this will become a focus soon?

Also, what is your take on the question whether SHA-256 is good enough?
SHA-1 was broken theoretically already 10 years after it was published
(which unfortunately did not prevent us from baking it into Git), after
all, while SHA-256 is 16 years old and the only known weakness does not
apply to Git's usage?

Also, while I have the attention of somebody who knows a heck more about cryptography than Git's top 10 committers combined: how soon do you expect
practical SHA-1 attacks that are much worse than what we already have
seen? I am concerned that if we do not move fast enough to a new hash
algorithm, and somebody finds a way in the meantime to craft arbitrary
messages given a prefix and an SHA-1, then we have a huge problem on
our hands.

Ciao,
Johannes



Begin forwarded message:

From: Gilles Van Assche <gilles.van.assche@xxxxxxxxxxx>
Subject: Re: RFC v3: Another proposed hash function transition plan
Date: 30 Sep 2017 22:20:42 CEST
To: Joan Daemen <joan@xxxxxxxx>, keccak@xxxxxxxxxxx

Dag Joan,

About the implementations, there are many high-quality implementations of Keccak besides the KCP that you could also mention. E.g., those in OpenSSL are very good. And there are all those related to Ethereum.

I tend to agree with Guido regarding SHA-1: even if you are right, there is no need to downplay or excuse the impact of collisions too much, as there could be unexpected use cases. And it's not clean. (And don't underestimate the probability of being quoted on this.)

Finally, just to say that I like your last paragraph.

Kind regards,
Gilles




Joan Daemen <joan@xxxxxxxx> wrote:
What about replying with something like this (please have a critical look)? I sent this from my Radboud account as I have problems with my Thunderbird settings. When trying to send a mail, it sometimes works and sometimes it says "An error occurred while sending mail: Outgoing server (SMTP) error. The server responded: 4.7.1 <joans-mbp.home>: Helo command rejected: Host not found."
Dear Johannes,
thanks for your response and taking the effort to express your concerns. Please see below for some feedback.
On 30/09/17 00:33, Johannes Schindelin wrote:
Hi Joan,

On Fri, 29 Sep 2017, Joan Daemen wrote:

if ever there was a SHA-2 competition, it must have been held inside NSA:-)
Oops. My bad, I indeed got confused about that, as you suggest below (I
actually thought of the AES competition, but that was obviously not about
SHA-2). Sorry.

But maybe you are confusing it with the SHA-3 competition. In any case,
when considering SHA-2 vs SHA-3 for usage in git, you may have a look at
arguments we give in the following blogpost:

https://keccak.team/2017/open_source_crypto.html
Thanks for the pointer!

Small nit: the post uses "its" in place of "it's", twice.
Thanks, we'll correct that.

It does have a good point, of course: the scientific exchange (which you
call "open-source" in spirit) makes tons of sense.

As far as Git is concerned, we not only care about the source code of the
hash algorithm we use, we need to care even more about what you call
"executable": ready-to-use, high quality, well-tested implementations.

We carry source code for SHA-1 as part of Git's source code, which was
hand-tuned to be as fast as Linus could get it, which was tricky given
that the tuning should be general enough to apply to all common Intel
CPUs.

This hand-crafted code was blown out of the water by OpenSSL's SHA-1 in
our tests here at Microsoft, thanks to the fact that OpenSSL does
vectorized SHA-1 computation now.

To me, this illustrates why it is not good enough to have only a reference
implementation available at our fingertips. Of course, above-mentioned
OpenSSL supports SHA-256 and SHA3-256, too, and at least recent versions
vectorize those, too.
There is a lot of high-quality optimized code for all SHA-3 functions and many CPUs in the Keccak code package https://github.com/gvanas/KeccakCodePackage

By the way, you speak about SHA3-256, but the right choice would be to use SHAKE128. Well, what exactly the right choice is depends on what you want. If you want to have a function in the SHA3 standard (FIPS 202), it is SHAKE128. You can boost performance on high-end CPUs by adopting ParallelHash from NIST SP 800-185, still a NIST standard. You can multiply that performance again by a factor of 2 by adopting KangarooTwelve. This is our (Keccak team) proposal for a parallelizable Keccak-based hash function that has a safety margin comparable to that of the SHA-2 functions. See https://keccak.team/kangarootwelve.html
May I also suggest you read https://keccak.team/2017/is_sha3_slow.html

Also, ARM processors have become a lot more popular, so we'll want to have
high-quality implementations of the hash algorithm also for those
processors.

Likewise, in contrast to 2005, nowadays implementations of Git in
languages as obscure as Javascript are not only theoretical but do exist
in practice (https://github.com/creationix/js-git). I had a *very* quick
look for libraries providing crypto in Javascript and immediately found
the Stanford Javascript Crypto library
(https://github.com/bitwiseshiftleft/sjcl/) which seems to offer SHA-256
but not SHA3-256 computation.

Back to Intel processors: I read some vague hints about extensions
accelerating SHA-256 computation on future Intel processors, but not
SHA3-256.

It would make sense, of course, that more crypto libraries and more
hardware support would be available for SHA-256 than for SHA3-256 given
the time since publication: 16 vs 5 years (I am playing it loose here,
taking just the year into account, not the exact date, so please treat
that merely as a ballpark figure).

So from a practical point of view, I wonder what your take is on, say,
hardware support for SHA3-256. Do you think this will become a focus soon?
I think this is a chicken-and-egg problem. In any case, hardware support for SHA3-256 will also work for the other SHA3 and SHAKE functions, as they all use the same underlying primitive: the Keccak-f permutation. This is not the case for SHA2, because SHA224 and SHA256 use a different compression function than SHA384, SHA512, SHA512/224 and SHA512/256.

Also, what is your take on the question whether SHA-256 is good enough?
SHA-1 was broken theoretically already 10 years after it was published
(which unfortunately did not prevent us from baking it into Git), after
all, while SHA-256 is 16 years old and the only known weakness does not
apply to Git's usage?
I think even the weakness of SHA-1 will be hard to exploit to do something bad in Git. SHA-256 is more conservative than SHA-1 and I don't expect it to be broken (unless NSA inserted a backdoor but I don't think that is likely). But I also don't expect SHAKE128, ParallelHash or KangarooTwelve to be broken, looking at the existing cryptanalysis.
Also, while I have the attention of somebody who knows a heck more about
cryptography than Git's top 10 committers combined: how soon do you expect
practical SHA-1 attacks that are much worse than what we already have
seen? I am concerned that if we do not move fast enough to a new hash
algorithm, and somebody finds a way in the meantime to craft arbitrary
messages given a prefix and an SHA-1, then we have a huge problem on
our hands.
As said, I don't expect practical SHA-1 attacks soon. But let me end with a philosophical note. Independent of all the arguments for and against, I think this is about doing the right thing. The choice here is between SHA1/SHA2 on the one hand and SHA3/Keccak on the other. The former standards are imposed on us by NSA and the latter are the best that came out of an open competition involving all experts worldwide. What would be closest to the philosophy of Git (and by extension Linux or open-source in general)?

Kind regards,

Joan




