This is something that has been bothering me for several weeks now.

Waaaaaaay back Git was considered to be secure as it never overwrote an object it already had. This was ensured by always unpacking the packfile received over the network (both in fetch and receive-pack) and by our already existing logic to not create a loose object for an object we already have.

Lately however we keep "large-ish" packfiles on both fetch and push by running them through index-pack instead of unpack-objects. This would let an attacker perform a birthday attack.

How? Assume the attacker knows a SHA-1 that has two different data streams. He knows the client is likely to have the "good" one. So he sends the "evil" variant to the other end as part of a "large-ish" packfile. The recipient keeps that packfile, and indexes it.

Now since this is a birthday attack there is a SHA-1 collision; two objects exist in the repository with the same SHA-1. They have *very* different data streams. One of them is "evil".

Currently the poor recipient cannot tell the two objects apart, short of examining the timestamps of the packfiles. But let's say the recipient repacks before he realizes he's been attacked. We may wind up packing the "evil" version of the object, and deleting the "good" one. This is made *even more likely* by Junio's recent rearrange_packed_git patch (b867092f).

SHA-1 is generally considered to be broken, as there have been some attacks implemented where a massive amount of garbage is injected into a comment, producing a source file that a compiler/interpreter can still process just fine, but that contains "evil bits of code" and has the same hash as a "non-evil" version of that same file.

Yes, of course, if you look at the comment you would immediately realize it's crap. You probably would even realize the file is crap just by looking at the file size, as typically several megabytes of garbage are required.
But how likely are you to look at a file's content, or even its size, during say git-bisect? Especially on a large project? Would you really notice that "usb.c" took 3 seconds longer than normal to compile because the preprocessor had to wade through a gigantic garbage comment?

We broke a fundamental assumption in the Git security model, and I don't think anyone blinked. Oops. Either the SHA-1 birthday attack I just described is still thought to be a non-issue for at least the next few years (due to current computing power limitations), or we all missed that one, big time.

The fix does appear to be simple: just don't write an object we already have to the output packfile. But really that is a lot more like what unpack-objects does: buffer deltas we cannot resolve yet, and only write out what we cannot find through our ODB. The logic in index-pack ain't built for that...

For those that are really paranoid about this, you can disable the pack keeping by setting transfer.unpackLimit to a *huge* value, one that is far larger than the number of objects in any packfile you might receive.

-- 
Shawn.
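
P.S. For anyone who wants to see the difference between the two receive paths concretely, here is a toy Python model. It is only a sketch, not Git's actual code: ToyRepo, receive_unpack, receive_keep and the object id are all names I made up for illustration, and the "collision" is simulated by simply handing in the same id twice rather than by a real SHA-1 collision.

```python
class ToyRepo:
    """Toy object store; NOT how Git is implemented, just the policy difference."""

    def __init__(self):
        self.loose = {}   # object id -> data (loose objects)
        self.packs = []   # kept packs, oldest first; each maps object id -> data

    def has(self, oid):
        # True if any loose object or kept pack already provides this id.
        return oid in self.loose or any(oid in pack for pack in self.packs)

    def receive_unpack(self, incoming):
        # unpack-objects style: explode the pack and skip ids we already have,
        # so an existing object is never replaced or shadowed.
        for oid, data in incoming.items():
            if not self.has(oid):
                self.loose[oid] = data

    def receive_keep(self, incoming):
        # index-pack style: keep the received pack whole, with no per-object
        # "do we already have this?" check.
        self.packs.append(dict(incoming))

    def copies(self, oid):
        # Every data stream currently stored under this id.
        found = [self.loose[oid]] if oid in self.loose else []
        found.extend(pack[oid] for pack in self.packs if oid in pack)
        return found


COLLIDING_ID = "deadbeef"  # hypothetical id shared by the "good" and "evil" data

safe = ToyRepo()
safe.receive_unpack({COLLIDING_ID: b"good"})
safe.receive_unpack({COLLIDING_ID: b"evil"})   # silently ignored

risky = ToyRepo()
risky.receive_unpack({COLLIDING_ID: b"good"})
risky.receive_keep({COLLIDING_ID: b"evil"})    # now both copies exist
```

After this runs, safe.copies(COLLIDING_ID) holds only the "good" data, while risky.copies(COLLIDING_ID) holds both, and a later repack in the second case has to pick one of them.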