Re: SHA1 collisions found

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Just swap in md5 in place of sha1.  Pad with '0'.  That'll give you
all the collisions you want and none of those you don't want.

On Mon, Feb 27, 2017 at 5:43 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Feb 27, 2017 at 10:57:37AM +0100, Geert Uytterhoeven wrote:
>
>> > Yeah, that is a lot more flexible for experimenting. Though I'd think
>> > you'd probably want more than 4 bits just to avoid accidental
>> > collisions. Something like 24 bits gives you some breathing space (you'd
>> > expect a random collision after 4096 objects), but it's still easy to
>> > do a preimage attack if you need to.
>>
>> Just shortening the hash causes lots of collisions between objects of
>> different types. While it's valuable to test git behavior for those cases, you
>> probably want some way to explicitly test collisions that do not change
>> the object type, as they're not trivial to detect.
>
> Right, that's why I'm suggesting to make a longer truncation so that
> you don't get accidental collisions, but can still find a few specific
> ones for your testing.
>
> 24 bits is enough to make toy repositories. If you wanted to store a
> real repository with the truncated sha1s, you might use 36 bits (that's
> 9 hex characters, which is enough for git.git to avoid any accidental
> collisions). But you can still find a collision via brute force in 2^18
> tries, which is not so bad.
>
> I.e., something like:
>
> diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
> index 22b125cf8..9158e39ed 100644
> --- a/block-sha1/sha1.c
> +++ b/block-sha1/sha1.c
> @@ -233,6 +233,10 @@ void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
>
>  void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
>  {
> +       /* copy out only the first 36 bits */
> +       static const uint32_t mask_bits[5] = {
> +               0xffffffff, 0xf0000000
> +       };
>         static const unsigned char pad[64] = { 0x80 };
>         unsigned int padlen[2];
>         int i;
> @@ -247,5 +251,5 @@ void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
>
>         /* Output hash */
>         for (i = 0; i < 5; i++)
> -               put_be32(hashout + i * 4, ctx->H[i]);
> +               put_be32(hashout + i * 4, ctx->H[i] & mask_bits[i]);
>  }
> Build that and make it available as git.broken, and then feed your repo
> into it, like:
>
>   git init --bare fake.git
>   git fast-export HEAD | git.broken -C fake.git fast-import
>
> at which point you have an alternate-universe version of the repository,
> which you can operate on as usual with your git.broken tool.
>
> And then you can come up with collisions via brute force:
>
>   # hack to convince hash-object to do lots of sha1s in a single
>   # invocation
>   N=300000
>   for i in $(seq $N); do
>     echo $i >$i
>   done
>   seq 300000 | git.broken hash-object --stdin-paths >hashes
>
>   for collision in $(sort hashes | uniq -d); do
>         grep -n $collision hashes
>   done
>
> The result is that "33713\n" and "170653\n" collide. So you can now add
> those to your fake.git repository and watch the chaos ensue.
>
> -Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]