Re: SHA1 collisions found

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 27, 2017 at 10:57:37AM +0100, Geert Uytterhoeven wrote:

> > Yeah, that is a lot more flexible for experimenting. Though I'd think
> > you'd probably want more than 4 bits just to avoid accidental
> > collisions. Something like 24 bits gives you some breathing space (you'd
> > expect a random collision after 4096 objects), but it's still easy to
> > do a preimage attack if you need to.
> 
> Just shortening the hash causes lots of collisions between objects of
> different types. While it's valuable to test git behavior for those cases, you
> probably want some way to explicitly test collisions that do not change
> the object type, as they're not trivial to detect.

Right, that's why I'm suggesting to make a longer truncation so that
you don't get accidental collisions, but can still find a few specific
ones for your testing.

24 bits is enough to make toy repositories. If you wanted to store a
real repository with the truncated sha1s, you might use 36 bits (that's
9 hex characters, which is enough for git.git to avoid any accidental
collisions). But you can still find a collision via brute force in 2^18
tries, which is not so bad.

I.e., something like:

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 22b125cf8..9158e39ed 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -233,6 +233,10 @@ void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
 
 void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
 {
+	/* copy out only the first 36 bits */
+	static const uint32_t mask_bits[5] = {
+		0xffffffff, 0xf0000000
+	};
 	static const unsigned char pad[64] = { 0x80 };
 	unsigned int padlen[2];
 	int i;
@@ -247,5 +251,5 @@ void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
 
 	/* Output hash */
 	for (i = 0; i < 5; i++)
-		put_be32(hashout + i * 4, ctx->H[i]);
+		put_be32(hashout + i * 4, ctx->H[i] & mask_bits[i]);
 }
Build that and make it available as git.broken, and then feed your repo
into it, like:

  git init --bare fake.git
  git fast-export HEAD | git.broken -C fake.git fast-import

at which point you have an alternate-universe version of the repository,
which you can operate on as usual with your git.broken tool.

And then you can come up with collisions via brute force:

  # hack to convince hash-object to do lots of sha1s in a single
  # invocation
  N=300000
  for i in $(seq $N); do
    echo $i >$i
  done
  seq 300000 | git.broken hash-object --stdin-paths >hashes

  for collision in $(sort hashes | uniq -d); do
	grep -n $collision hashes
  done

The result is that "33713\n" and "170653\n" collide. So you can now add
those to your fake.git repository and watch the chaos ensue.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]