Re: MD5 To Be Considered Harmful Someday

Jack Lloyd <lloyd@xxxxxxxxxxxxx> · Wed, 8 Dec 2004 13:43:07 -0700

On Tue, Dec 07, 2004 at 06:46:20PM -0700, Joel Maslak wrote:

> The short-term fix seems to be something I've been recommending for a
> while:
> 
> Compute hashes with both SHA-1 and MD5.
> 
> The chance of one algorithm becoming compromised in the mid-term is
> relatively high IMHO (I was responsible for a PKI system which had to keep
> integrity for 20 year periods of time - not an easy task considering what
> we don't know about the future).  The chance of two becoming compromised
> is relatively less.  The chance of a problem with MD5 and SHA-1 allowing
> two different files to have collisions in both algorithms in *BOTH* is
> very very small.

Actually there are without a doubt many files where MD5 and SHA-1 both collide;
this is a simple consequence of the fact that you have nearly arbitrarily sized
inputs (up to 2^61 bytes) and a very small output. Even if you idealize
MD5||SHA-1 as a 288 bit hash function, you get collisions after ~2^144 tests by
the birthday paradox, same as any other 288 bit hash.  Which I suppose counts
as very very small, and is probably sufficient for 20 year security, but that
estimate ignores the fact that MD5||SHA-1 is not an ideal 288 bit hash.
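
For concreteness, the arithmetic above can be checked with a short Python
sketch. It assumes the construction is a plain concatenation of the two
digests; the helper name md5_sha1 is made up for illustration and is not from
any library.

```python
import hashlib

def md5_sha1(data: bytes) -> bytes:
    """Hypothetical helper: the MD5 digest followed by the SHA-1 digest."""
    return hashlib.md5(data).digest() + hashlib.sha1(data).digest()

# Output size of the concatenation: 128 bits (MD5) + 160 bits (SHA-1).
digest_bits = 8 * len(md5_sha1(b""))

# Birthday bound for an *ideal* hash of that size: about 2^(n/2) trials.
birthday_exponent = digest_bits // 2

# SHA-1 stores the message length in a 64-bit field counting bits,
# which is where the "up to 2^61 bytes" input limit comes from.
max_input_bytes = 2**64 // 8

print(digest_bits, birthday_exponent, max_input_bytes == 2**61)
```

The 2^144 figure only holds if the combined function behaves like an ideal
288 bit hash, which is exactly the assumption in question.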

The most obvious example of that is that by using one of the known MD5
collision pairs, you can cause 5/9 of the hash output (the 160 SHA-1 bits) to
change while keeping the rest (the 128 MD5 bits) constant. While this is not a
problem when the hash is merely a hash, it does mean you can't realistically
model it as a PRF.
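
A toy version of this effect is easy to reproduce. The sketch below does NOT
use a real MD5 collision pair; as a stand-in it truncates both digests to 16
bits so that a collision in the MD5 half can be found by brute force. The
function names and the truncation length are assumptions made purely for
illustration.

```python
import hashlib
from itertools import count

def toy_concat(msg: bytes, nbytes: int = 2):
    """Toy stand-in for MD5||SHA-1: each digest truncated to nbytes."""
    md5_half = hashlib.md5(msg).digest()[:nbytes]
    sha_half = hashlib.sha1(msg).digest()[:nbytes]
    return md5_half, sha_half

def find_half_collision(nbytes: int = 2):
    """Find two messages whose MD5 halves match but whose SHA-1 halves differ."""
    seen = {}  # truncated MD5 value -> first message that produced it
    for i in count():
        msg = str(i).encode()
        md5_half, sha_half = toy_concat(msg, nbytes)
        if md5_half not in seen:
            seen[md5_half] = msg
            continue
        other = seen[md5_half]
        # Skip the rare pair that happens to collide in both halves.
        if toy_concat(other, nbytes)[1] != sha_half:
            return other, msg

a, b = find_half_collision()
for m in (a, b):
    md5_half, sha_half = toy_concat(m)
    print(m, md5_half.hex(), sha_half.hex())
```

The two printed lines share the first half and differ in the second, which is
the structure a real MD5 collision pair gives you in the full 288 bit output.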

I wouldn't be surprised if there is some way to break this much faster than
2^144 by taking advantage of the fact that you can compute each half of the
hash independently of the other, but I can't think of a convincing argument for
this at the moment.

Jack