On Monday 28 April 2008 21:34:50 Daniel Barkalow wrote:
> On Mon, 28 Apr 2008, Henrik Austad wrote:
> > Hi list!
> >
> > As far as I have gathered, the SHA-1-sum is used as a identifier for
> > commits, and that is the primary reason for using sha1. However, several
> > places (including the google tech-talk featuring Linus himself) states
> > that the id's are cryptographically secure.
> >
> > As discussed in [1], SHA-1 is not as secure as it once was (and this was
> > in 2005), and I'm wondering - are there any plans for migrating to
> > another hash-algorithm? I.e. SHA-2, whirlpool..
>
> No. The cryptographic security we care about is that it's impractical to
> come up with another set of content that hashes to the same value as a
> given set of content. The known attacks on SHA-1 (and more broken earlier
> hashes in the same general class) only allow the attacker to produce two
> files that will collide. Now, it's true that this would allow somebody to
> produce a commit where some people see the "good" blob and some people see
> the "evil" blob, but (a) the "good" blob contains some large chunk of
> random data, which is a major red flag by itself, and (b) all of these
> people have to be taking data from the attacker.

Yes, I can see that point, but I was thinking more along the lines of:

1) clone the repo
2) add malicious code
3) add a huge comment block, ifdef block etc. somewhere obscure in the code,
   and keep adding random data until the hash matches a well-known release
4) publish the repo, or even worse, change the central repo

Most users, and probably a lot of developers, never browse through the
*entire* archive looking for this, and as long as the hash checks out - why
would you?

Yes, it would probably be discovered soon enough, but take the Linux kernel
as an example - if you get, say, 100 infected machines due to this, what
would that do to the reputation of the kernel?
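To get a feel for what step 3 would cost, here is a minimal sketch in Python. It relies on one fact about git: a blob's ID is the SHA-1 of the header "blob <size>\0" followed by the file content. Everything else is an assumption made for illustration: the function names, the two toy source files, and above all the shortcut of matching only the first few hex digits of the ID instead of all 40.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the ID git assigns to a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def pad_until_prefix_matches(payload: bytes, target_id: str, hex_digits: int,
                             max_tries: int = 1_000_000):
    """Append a junk comment to payload until the first hex_digits of its
    blob ID equal those of target_id.

    Returns (padded_content, tries), or None if max_tries is exhausted.
    This is plain brute force, not one of the published SHA-1 attacks.
    """
    wanted = target_id[:hex_digits]
    for tries in range(1, max_tries + 1):
        candidate = payload + b"/* %d */\n" % tries
        if git_blob_sha1(candidate)[:hex_digits] == wanted:
            return candidate, tries
    return None


if __name__ == "__main__":
    # Hypothetical "released" file and hypothetical tampered file.
    good = b"int main(void) { return 0; }\n"
    evil = b"int main(void) { return backdoor(); }\n"
    target = git_blob_sha1(good)

    for digits in (1, 2, 3, 4):
        result = pad_until_prefix_matches(evil, target, digits)
        if result is None:
            print(digits, "hex digits: gave up")
        else:
            print(digits, "hex digits matched after", result[1], "tries")
```

Each additional hex digit multiplies the expected number of tries by 16, so matching all 40 digits of a given ID this way takes on the order of 2^160 attempts. The known collision attacks do not shorten this search, because they only produce two files that collide with each other, not a file that matches a hash somebody else has already published.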
> If somebody gives you some source, and it's got some large random chunk in
> it, and the behavior of the object depends on the content of this chunk,
> and it's unspecified where this chunk comes from, you should be aware
> that they might be able to swap this chunk for a different chunk. But such
> a file is pretty blatantly malicious anyway.

True, but this actually means you have to verify *everything*, even though
the hash checks out.

But yes, I can see your point: it would most likely be infeasible to
generate a collision using this approach, and changing to another hash
function would probably not add much. Basically I was just curious and
playing around with the idea.

Thanks for the answer though :)

-- 
mvh
Henrik Austad