On Thu, Sep 05, 2024 at 01:04:34PM -0400, Taylor Blau wrote:
> On Thu, Sep 05, 2024 at 09:51:00AM -0700, Junio C Hamano wrote:
> > Taylor Blau <me@xxxxxxxxxxxx> writes:
> >
> > > If so, I agree, but would note that this series does not yet switch
> > > index-pack to use the non-collision detecting SHA-1 implementation when
> > > available, so that would not be a prerequisite for merging this series.
> >
> > Hmph, I am confused. It needs to be corrected in order to address
> > collisions of the tail sum Peff raised, as no longer checked the
> > tail sum with SHA1DC but with "fast" SHA-1.
>
> Peff's mail supposes that we have already modified index-pack to use the
> non-collision detecting SHA-1 implementation. But this series does not
> do that, so I don't think we have an issue to address here.
>
> In a hypothetical future series where we do modify index-pack to use the
> _FAST SHA-1 implementation, then we would need to address the issue that
> Peff raised first as a prerequisite.

I verified that this was the case by applying only the following to my
series:

--- 8< ---
diff --git a/sha1/openssl.h b/sha1/openssl.h
index 1038af47da..f0d5c59c43 100644
--- a/sha1/openssl.h
+++ b/sha1/openssl.h
@@ -32,6 +32,8 @@ static inline void openssl_SHA1_Final(unsigned char *digest,
 {
 	EVP_DigestFinal_ex(ctx->ectx, digest, NULL);
 	EVP_MD_CTX_free(ctx->ectx);
+	memset(digest, 0, 19);
+	digest[19] &= 0x3;
 }
 
 static inline void openssl_SHA1_Clone(struct openssl_SHA1_CTX *dst,
--- >8 ---

and then creating a victim.git repository (which in my case was born
from git.git) and then repacking to produce the following state:

    $ ls -la victim.git/objects/pack
    total 262704
    drwxr-xr-x 2 ttaylorr ttaylorr      4096 Sep  5 13:45 .
    drwxr-xr-x 4 ttaylorr ttaylorr      4096 Sep  5 13:46 ..
    -r--r--r-- 1 ttaylorr ttaylorr   3306804 Sep  5 13:45 pack-0000000000000000000000000000000000000003.bitmap
    -r--r--r-- 1 ttaylorr ttaylorr  15588224 Sep  5 13:44 pack-0000000000000000000000000000000000000003.idx
    -r--r--r-- 1 ttaylorr ttaylorr 247865480 Sep  5 13:44 pack-0000000000000000000000000000000000000003.pack
    -r--r--r-- 1 ttaylorr ttaylorr   2226788 Sep  5 13:44 pack-0000000000000000000000000000000000000003.rev

Then I set up an "evil" repository like in Peff's recipe and started
repeatedly pushing. fsck is slow here, so the loop is just "while true",
but it doesn't matter that we're not fscking the victim repository since
I'll show in a second that it's not corrupted to begin with.

Running this loop:

    $ while true
      do
        ls -l ../victim.git/objects/pack/
        git.compile commit --allow-empty -m foo
        git.compile push ../victim.git HEAD:foo
      done
    $ ls -l ../victim.git/objects/pack/

, fails very quickly and produces the following:

    [main 727346d] foo
    Enumerating objects: 12, done.
    Counting objects: 100% (12/12), done.
    Delta compression using up to 20 threads
    Compressing objects: 100% (11/11), done.
    Writing objects: 100% (12/12), 779 bytes | 779.00 KiB/s, done.
    Total 12 (delta 10), reused 0 (delta 0), pack-reused 0 (from 0)
    remote: fatal: final sha1 did not match
    error: remote unpack failed: unpack-objects abnormal exit
    To ../victim.git
     ! [remote rejected] HEAD -> foo (unpacker error)
    error: failed to push some refs to '../victim.git'

The victim repository rightly rejects the push, since even though the
evil repository generated a pack with a colliding checksum value, the
victim repository validated it using the collision-detecting /
non-broken SHA-1 implementation and rejected the pack appropriately.

Of course, if index-pack were updated to use the non-collision detecting
implementation of SHA-1 when compiled using one of the _FAST knobs,
*and* we blindly updated index-pack to use the _fast variants without
doing anything else in Peff's mail, then we would have corruption.
But I think the point of Peff's mail is to illustrate that this is only
a problem in a world where index-pack uses the _fast SHA-1
implementation, but does not have any additional protections in place.

Thanks,
Taylor
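
For reference, the effect of the two lines added to openssl_SHA1_Final()
in the patch above can be sketched in isolation. This is only an
illustration: truncate_digest() and digest_to_hex() are hypothetical
helper names, the input bytes are arbitrary, and nothing here calls
OpenSSL or Git. It shows that after the patch only the low two bits of
the last byte survive, so every digest collapses to one of four values,
which would explain why the test collides so quickly.

```c
#include <stdio.h>
#include <string.h>

#define SHA1_RAWSZ 20

/*
 * Hypothetical helper: applies the same two statements that the patch
 * adds after EVP_DigestFinal_ex(). The first 19 bytes are zeroed and
 * only the low two bits of the last byte are kept.
 */
static void truncate_digest(unsigned char *digest)
{
	memset(digest, 0, 19);
	digest[19] &= 0x3;
}

/*
 * Hypothetical helper: writes the 40-character hex form of a 20-byte
 * digest into hex, which must have room for 41 bytes.
 */
static void digest_to_hex(const unsigned char *digest, char *hex)
{
	int i;

	for (i = 0; i < SHA1_RAWSZ; i++)
		snprintf(hex + 2 * i, 3, "%02x", digest[i]);
}
```

Feeding any 20-byte buffer whose last byte has its two low bits set
(for example 0xb3) through truncate_digest() and then digest_to_hex()
yields 0000000000000000000000000000000000000003, the same shape as the
pack name in the listing above.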