Re: NIST's policy: sha-1 until 2010, after 2010 sha-2.

David Brown <git@xxxxxxxxxx> · Fri, 28 Dec 2007 21:07:12 -0800

On Sat, Dec 29, 2007 at 03:45:35PM +1100, David Symonds wrote:
On Dec 29, 2007 3:27 PM, J.C. Pizarro <jcpiza@xxxxxxxxx> wrote:
Dear Linus Torvalds,

What do you plan to do when git has to change from SHA-1 to SHA-2
  because of the weaker collision resistance of SHA-1 in the coming years?

    (e.g. because of a damn developer trying to commit a file with a
     colliding SHA-1)

It's a non-issue. The closest-to-practical attack method on SHA-1 is a
collision-finding attack, which means you can find two messages with the
same hash. It is not a second pre-image attack, where you find a message
matching the hash of one that already exists. As far as I know, there's
no significant weakness known for finding a pre-image, which would be
the most practical way of weakening Git's "security" via SHA-1
substitution.

<http://en.wikipedia.org/wiki/Birthday_attack> has some good background on
the "problem".

I suppose when SHA-1 is broken and people can generate arbitrary files with
the same hash, it would be possible to use this to make files that are
annoying to use with Git.  Git wouldn't have any problem with ordinary
colliding files, since it hashes each file together with a header prefix
(the object type and size), so the files would have to be generated
specifically for Git.
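
The prefix mentioned above can be sketched in a few lines of Python.  For a
blob, Git hashes a header of the form "blob <size>\0" followed by the file
contents.  The function name git_blob_hash is made up for this illustration;
it is a minimal sketch, not Git's own code.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with this content."""
    # Git hashes a header ("blob", a space, the decimal size, a NUL byte)
    # followed by the raw content, not the raw content on its own.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    print("plain sha1:", hashlib.sha1(data).hexdigest())
    print("git blob  :", git_blob_hash(data))
```

The second value should match what `git hash-object` prints for a file with
the same contents, and it differs from the plain SHA-1.  Two files that
collide under plain SHA-1 therefore need not collide once the header is
prepended.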

Regardless, the use of SHA-1 in Git isn't primarily for security,
though it is a nice side-effect. The SHA-1 is there for reliability in
addressing and as a good hash.

Given a method for producing a colliding pair for SHA-1, it would be
possible to check in one version of a file and later replace it in a
repository with the other version without detection.  The currently known
colliding pairs for MD5 contain blocks of binary data, so this would be
fairly obvious if it got checked into source code.

It would also only replace the blob on a compromised machine.  Anyone who
has already pulled the blob wouldn't download the new one.

As for a collision occurring accidentally, according to the Wikipedia page
(the math looks right), for a 128-bit hash, 820 billion objects would have
a 10^(-15) probability of a collision.  SHA-1 is 160 bits, so the
probability is even lower.
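
As a rough check of those numbers, here is a small sketch using the usual
birthday approximation p ~= 1 - exp(-n(n-1)/2 / 2^bits).  It assumes the
hash behaves like a uniformly random function, and the function name
collision_probability is just for illustration.

```python
import math


def collision_probability(num_objects: float, hash_bits: int) -> float:
    """Approximate chance that at least two of num_objects hashes collide."""
    pairs = num_objects * (num_objects - 1) / 2.0
    # Birthday approximation: p ~= 1 - exp(-pairs / 2**bits).
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    n = 820e9  # 820 billion objects, the figure quoted above
    for bits in (128, 160):
        p = collision_probability(n, bits)
        print(f"{bits}-bit hash, {n:.3g} objects: p ~= {p:.2e}")
```

For 128 bits this comes out at about 10^(-15), as stated, and the extra 32
bits of SHA-1 make the figure smaller by a factor of 2^32.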

The possible (or even likely) breaking of SHA-1 only concerns intentional
collisions.  Against accidental collisions, SHA-1 should be good for
trillions of objects, and that's with all of them in the same repo.
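
To put a number on "trillions", the same birthday approximation can be
inverted to ask how many objects it takes to reach a given collision
probability.  This is only a sketch under the assumption of a uniformly
random hash; objects_for_probability is a hypothetical helper name.

```python
import math


def objects_for_probability(prob: float, hash_bits: int) -> float:
    """Approximate number of random hashes needed for a collision chance of prob."""
    # Invert p ~= 1 - exp(-n^2 / 2^(bits+1)):
    #   n ~= sqrt(2^(bits+1) * -ln(1 - p))
    return math.sqrt(2.0 ** (hash_bits + 1) * -math.log1p(-prob))


if __name__ == "__main__":
    for prob in (1e-15, 1e-9):
        n = objects_for_probability(prob, 160)
        print(f"160-bit hash, p = {prob:g}: about {n:.2e} objects")
```

For a 160-bit hash and p = 10^(-15) the result is in the 10^16 range, so
trillions of objects in one repo leaves a wide margin.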

It might be worth tossing around ideas for moving to a larger hash in the
fairly long term, though.

Dave