On Wed, Nov 30, 2011 at 04:41:15PM -0800, Bill Zaumen wrote: > Aside from that, suppose the attacker does what you suggests (providing > a valid CRC so that git commands like verify-pack don't have an error > to detect). You can't tell that something is wrong, but Linus can - > the next time he does a fetch. If he fetches, the server sends > some SHA-1 hashes and the client responds with 'have' or 'want' in a > reply. If the client wants it, the client doesn't have a CRC, if the > client sends 'have', the CRCs are available so those get sent. The > server then checks, notices a mismatch (probably in the CRC of the CRCs > of all the blobs in the commit's tree), and generates an error. OK. I thought your claim was that this would help detect collision-related forgeries in the chain of hashes while verifying a signed tag. But your claim is only that your scheme gives people who already have the objects an additional chance to notice discrepencies in what the server is serving and what they have. And then they can notify other people of the problem in an out-of-band way (i.e., telling kernel.org admins that the system is hacked, or telling users not to fetch from there). Cryptographically speaking, I think that claim is sound, and you can certainly construct attack scenarios where this detection would help. However, quantifying the effectiveness is difficult. What is the likelihood of a malicious collided-object replacement being detected without your scheme? What is it with it? There are many questions that factor into guessing the latter. How often does Linus fetch from his own kernel.org repository? He would usually push, I would think. Even if he does fetch, he wouldn't be getting the old objects that he already has. I guess this is the reason for your digest-of-digests for each commit object sent? But what about objects that are no longer in current commits, but are in older ones? E.g., v1.0 of foo.c has sha1 X, and an attacker finds a collision and replaces it. Meanwhile, the tip of development now has replaced foo.c with X'. When Linus talks to the server, git will never care about X (it is neither being transmitted, nor is part of a commit that is being transmitted). But people may still be cloning and using the v1.0 tag, assuming it is valid. What about the server being more clever about hiding the replacement object? E.g., instead of just breaking into kernel.org and inserting a replacement object, the attacker runs a malicious git-daemon that returns the bogus object to cloners, but the real object to fetchers. Thus there is nothing for Linus to detect, but new cloners get the malicious object. Or you could give the bogus object to people who are getting the object for the first time (since they presumably have no crc for that object), but otherwise use the real object. You could get around this deception by pretending to be a "victim"; i.e., cloning fresh and seeing what the server gives you, comparing it to your known-good repository. Which is similar to what you suggest here: > It's also possible to write some additional commands to (for example) > fetch the SHA-1 hashes and CRCs from all remote repositories you use > and compare these to make sure they are all consistent, something that > can be run ocassionally. But we can already do that. Assume you have an existing repo "foo". To verify the copy at git://example.com/foo.git, do a fresh clone to "bar", and then compare the objects in "foo" to "bar", either byte-wise or by digest. > That result for birthday attack assumes the goal is to find two files > that will have the same SHA-1 value (or SHA-1 + CRC). The case I was > talking about (apologies if I did not state this clearly) is that you > have an existing object and an existing CRC and you want to generate > a second object with the same SHA-1 and same CRC as the first. A > birthday attack doesn't work so well in that case - the number of tries > is much higher than half the number of bits in the digest. Yes. This is called a "pre-image" attack (to be more specific, it is a "second pre-image attack", since you have the original object). And AFAIK, it's not feasible against SHA-1, nor will it be in the near future (even MD5, which is considered totally broken, does not have feasible pre-image attacks). > http://en.wikipedia.org/wiki/Birthday_attack#Digital_signature_susceptibility > has a discussion regarding digital signatures. The trick is for a > person to create a large number of variations of a "fair" and "unfair" > contract, and use a birthday attack to find a pair that have the same > hash. The variations are typically inconsequential changes (extra > spacing, commas, etc.) Right. This is the classic birthday attack. I don't keep up with the state of the art in collision attacks these days, but I think in practice they are usually executed by adding arbitrary binary garbage into a "hidden" spot in a file. Which makes it much harder to execute against text files like source code (as opposed to, say, a binary firmware blob). IIRC, the practical MD5 attacks involved postscript documents (with non-printing garbage sections that printers would ignore). > In the case I was discussing, a developer creates some code, commits > it and pushes it to a shared repository - the developer is not given > code by the attacker. The attacker can, however, see the code by > fetching it. An attack then consists of generating a collision, > change the object in the attacker's local repository, and then push > the original developer's commit (with the modified object) to another > shared repository before someone else puts the correct objects into > that repository. A birthday attack does not work in this case. Yeah, that is simply not feasible, and is not likely to become so anytime soon. > There one issue that this suggests however - it is not clear if the > 2^57 number given for the best SHA-1 attacks were attempts to generate > a new file with the same SHA-1 hash as an existing file or a pair of > files that have the same SHA-1 hash. If it is the latter, then an > attack is significantly harder than I assumed as a worst case, but > still possibly much, much better than brute force. The 2^57 number is for a collision of two new objects, not for a pre-image attack. AFAIK, there are currently no pre-image attacks on SHA-1 at all. I don't think there's a need to response individually to the points in the rest of your email; they're all based on the assumption that the attacker is replacing a known-good written-by-Linus commit with one found in a pre-image attack. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html