On Fri, Mar 24, 2017 at 11:37:54PM -0700, Junio C Hamano wrote: > The hash that names a packfile is constructed by sorting all the > names of the objects contained in the packfile and running SHA-1 > hash over it. I think this MUST be hashed with collision-attack > detection. A malicious site can feed you a packfile that contains > objects the site crafts so that the sorted object names would result > in a collision-attack, ending up with one pack that contains a sets > of objects different from another pack that happens to have the same > packname, causing Git to say "Ah, this new pack must have the same > set of objects as the pack we already have" and discard it, > resulting in lost objects and a corrupt repository with missing > objects. I don't think this case really matters for collision detection. What's important is what Git does when it receives a brand-new packfile that would overwrite an existing one. It _should_ keep the old one, under the usual "existing data wins" rule. It should be easy to test, though: $ git init tmp && cd tmp $ git commit --allow-empty -m foo $ git gc $ touch -d yesterday .git/objects/pack/* $ ls -l .git/objects/pack -r--r--r-- 1 peff peff 1128 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx -r--r--r-- 1 peff peff 153 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack $ git index-pack --stdin <.git/objects/pack/*.pack pack 7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b $ ls -l .git/objects/pack -r--r--r-- 1 peff peff 1128 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx -r--r--r-- 1 peff peff 153 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack Looks like the timestamps were retained. <phew> And if we use strace, we can see what happens: $ strace git index-pac k--stdin <.git/objects/pack/*.pack link(".git/objects/pack/tmp_pack_YSrdWU", ".git/objects/pack/pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack") = -1 EEXIST (File exists) unlink(".git/objects/pack/tmp_pack_YSrdWU") = 0 link(".git/objects/pack/tmp_idx_O94NNU", ".git/objects/pack/pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx") = -1 EEXIST (File exists) unlink(".git/objects/pack/tmp_idx_O94NNU") = 0 This is due to the link()/EEXIST handling in finalize_object_file. It has a FIXME for a collision check, so we could actually detect at that point whether we have a real collision, or if the other side just happened to send us the same pack. I wouldn't be surprised if the dumb-http walker is not so careful, though (but the solution is to make it careful, not to worry about a weak hash algorithm). The rest of your email all made sense to me. -Peff