On Mon, Apr 28, 2008 at 12:34 PM, Daniel Barkalow <barkalow@xxxxxxxxxxxx> wrote:
> On Mon, 28 Apr 2008, Henrik Austad wrote:
>
> > Hi list!
> >
> > As far as I have gathered, the SHA-1-sum is used as a identifier for commits,
> > and that is the primary reason for using sha1. However, several places
> > (including the google tech-talk featuring Linus himself) states that the id's
> > are cryptographically secure.
> >
> > As discussed in [1], SHA-1 is not as secure as it once was (and this was in
> > 2005), and I'm wondering - are there any plans for migrating to another
> > hash-algorithm? I.e. SHA-2, whirlpool..
>
> No. The cryptographic security we care about is that it's impractical to
> come up with another set of content that hashes to the same value as a
> given set of content. The known attacks on SHA-1 (and more broken earlier
> hashes in the same general class) only allow the attacker to produce two
> files that will collide. Now, it's true that this would allow somebody to
> produce a commit where some people see the "good" blob and some people see
> the "evil" blob, but (a) the "good" blob contains some large chunk of
> random data, which is a major red flag by itself, and (b) all of these
> people have to be taking data from the attacker.
>
> If somebody gives you some source, and it's got some large random chunk in
> it, and the behavior of the object depends on the content of this chunk,
> and it's unspecified where this chunk comes from, you should be aware
> that they might be able to swap this chunk for a different chunk. But such
> a file is pretty blatantly malicious anyway.

This argument is invalid, since the use of git is not limited to source
code. People can and do store unreadable binary data in git, and unless
you are completely sure that no one would ever care about the security of
that data in a way that can be attacked with a single collision, git
should be secure for that data as well.
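To make concrete why a collision matters for opaque binary files, here is a minimal sketch of how git derives a blob's ID from nothing but its bytes. The helper name git_blob_id is made up for illustration; the header layout ("blob", the size, a NUL byte) is the one git uses for blob objects.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git would assign to a blob with these bytes.

    git_blob_id is a hypothetical helper name, not part of git itself.
    """
    # git hashes a short header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Any change to the bytes normally changes the ID.
    print(git_blob_id(b"version A\n") != git_blob_id(b"version B\n"))
```

Running `git hash-object` on a file with the same bytes should print the same ID. Since the ID is all git compares, two different files that an attacker has engineered to hash identically would look like the same object, and nothing in a binary file makes the padding needed for that stand out.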
For example, I just converted a 20 GB repository to git which, among
other things, contains PDF files of my tax returns. I have looked them
over, but I have not opened them in a hex editor and looked them over at
the binary level, and I don't think git should expect me to.

Incidentally, git was the only version control system I tried except for
subversion that didn't choke on that repository. Mercurial looked at my
file renames and expanded the size past 45 GB before I killed it; I had
to fix several bugs in the bazaar conversion scripts before I realized it
was just too slow; and svk turns out to be even more like the Antichrist
than subversion itself is (mirroring N repository copies requires an
N-fold increase in size).

Geoffrey