Re: SHA1 collisions found

On Thu, Mar 02, 2017 at 11:55:45AM -0800, Linus Torvalds wrote:

> Anyway, I do have a suggestion for what the "object version" would be,
> but I'm not even going to mention it, because I want people to first
> think about the _concept_ and not the implementation.
> 
> So: What do you think about the concept?

I think it very much depends on what's in the "object version". :)

IMHO, we are best off considering sha1 "broken" and not counting on any of its
bytes for cryptographic integrity. I know that's not really the case,
but it just makes reasoning about the whole thing simpler. So at that
point, it's pretty obvious that the "object version" is really just "an
integrity hash".

And that takes us full circle to earlier proposals over the years to do
something like this in the commit header:

  parent ...some sha1...
  parent-sha256 ...some sha256...

and ditto in tag headers, and trees obviously need to be hackily
extended as you described to carry the extra hash. And then internally
we continue to happily use sha1s, except you can check the
sha256-validity of any reference if you feel like it.
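
To make the header idea concrete, here is a minimal sketch in Python. It
assumes the hypothetical "parent-sha256" header shown above, which is a
proposal and not something git implements. The helper names (object_id,
make_commit, verify_parent) are made up for illustration, and the commits
are simplified: a single parent and no author/committer lines. Only the
"<type> <size>\0<body>" object hashing is real git behavior.

```python
import hashlib

def object_id(obj_type, body, algo="sha1"):
    # Git hashes "<type> <size>\0" followed by the object body.
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.new(algo, header + body).hexdigest()

def make_commit(tree_sha1, parent_body, message):
    # Simplified commit body carrying both hashes of its parent.
    lines = [f"tree {tree_sha1}"]
    if parent_body is not None:
        lines.append(f"parent {object_id('commit', parent_body, 'sha1')}")
        # Hypothetical header from the proposal above.
        lines.append(f"parent-sha256 {object_id('commit', parent_body, 'sha256')}")
    lines += ["", message]
    return "\n".join(lines).encode() + b"\n"

def verify_parent(child_body, store):
    # Parse headers up to the first blank line.
    headers = {}
    for line in child_body.decode().split("\n"):
        if line == "":
            break
        key, _, value = line.partition(" ")
        headers[key] = value
    if "parent" not in headers:
        return True  # root commit, nothing to check
    # Lookup still happens by sha1; sha256 is only an integrity check.
    parent_body = store[headers["parent"]]
    return object_id("commit", parent_body, "sha256") == headers.get("parent-sha256")

empty_tree = object_id("tree", b"")
root = make_commit(empty_tree, None, "root")
child = make_commit(empty_tree, root, "child")
store = {object_id("commit", root): root}
print(verify_parent(child, store))
```

Everything is still named and looked up by sha1; the sha256 only comes
into play when somebody decides to check a reference.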

This is functionally equivalent to "just start using sha-256, but keep a
mapping of old sha1s to sha-256s to handle old references". The
advantage is that it makes the code part of the transition simpler. The
disadvantage is that you're effectively carrying a piece of that
sha1->sha256 mapping around in _every_ object.

And that means the same bits of mapping data are repeated over and over.
Git's pretty good at de-duplicating on the surface. So yeah, every tree
entry is now 256 bits larger, but deltas mean that we usually only end
up storing each entry a handful of times. But we still pay the price to
walk over the bytes every time we apply a delta, zlib inflate, parse the
tree, etc. The runtime cost of the transition is carried forward
forever, even for repositories that are willing to rewrite history, or
are created after the flag day.
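
A rough way to see the per-entry cost, as a sketch only: the real tree
entry format is "<mode> <name>\0" followed by the 20 raw sha1 bytes, but
the extended layout below (32 raw sha256 bytes appended to each entry) is
an assumption made for illustration, not a format anyone has specified.
The tree_entry helper and the file names are likewise hypothetical.

```python
import hashlib

def tree_entry(mode, name, blob, with_sha256=False):
    # Hash the blob the way git does: "blob <size>\0" + contents.
    blob_obj = f"blob {len(blob)}\0".encode() + blob
    entry = f"{mode} {name}\0".encode() + hashlib.sha1(blob_obj).digest()
    if with_sha256:
        # Assumed extension: append the raw sha256 after the sha1.
        entry += hashlib.sha256(blob_obj).digest()
    return entry

files = {f"file{i}.txt": f"contents {i}\n".encode() for i in range(100)}

plain = b"".join(
    tree_entry("100644", name, data) for name, data in sorted(files.items())
)
extended = b"".join(
    tree_entry("100644", name, data, with_sha256=True)
    for name, data in sorted(files.items())
)

# The extra bytes are fixed at 32 per entry, whatever the file contents.
print(len(plain), len(extended), len(extended) - len(plain))
```

Deltas can keep most of those extra bytes from being stored repeatedly
on disk, but every reader of the tree still has to inflate and step over
them.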

So I dunno. Maybe I am missing something really clever about your
proposal. Reading the rest of the thread, it sounds like you had a
thought that we could get by with a very tiny object version, but the
hash-adding thing nixed that. If I'm still missing the point, please try
to sketch it out a bit more concretely, and I'll come back with my
thinking cap on.

-Peff


