On Fri, Mar 31, 2017 at 10:35:06AM -0700, Junio C Hamano wrote: > Lars Schneider <larsxschneider@xxxxxxxxx> writes: > > > Hi, > > > > I just got a report with the following output after a "git fetch" operation > > using Git 2.11.0.windows.3 [1]: > > > > remote: Counting objects: 5922, done. > > remote: Compressing objects: 100% (14/14), done. > > error: inflate: data stream error (unknown compression method) > > error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header > > fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 ! > > fatal: index-pack failed > > > > I would be really surprised if we discovered a SHA1 collision in a production > > repo. My guess is that this is somehow triggered by a network issue (see data > > stream error). Any tips how to debug this? > > Perhaps the first thing to do is to tweak the messages in builtin/index-pack.c > to help you identify which one of identical 5 messages is firing. > > My guess would be that the code saw an object that came over the > wire, hashed it to learn that its object name is > 6acd8f279a8b20311665f41134579b7380970446, noticed that it locally > already has the object with the same name, and tried to make sure > they have identical contents (otherwise, what came over the wire is > a successful second preimage attack). But your loose object on disk > you already had was corrupt and didn't inflate correctly when > builtin/index-pack.c::compare_objects() or check_collision() tried > to. The code saw no data, or truncated data, or > whatever---something different from what the other data that hashed > down to 6acd8..., and reported a collision when there is no > collision. My guess is the one in compare_objects(). The "unable to unpack" error comes from sha1_loose_object_info(). We'd normally then follow up with read_sha1_file(), which would generate its own set of errors. But if open_istream() got a bogus value for the object size (but didn't correctly report an error), then it would probably quietly return 0 bytes from read_istream() later. I suspect this may improve things, but I haven't dug deeper to see if there are unwanted side effects, or if there are other spots that need similar treatment. diff --git a/sha1_file.c b/sha1_file.c index 43990dec7..38411f90b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2952,7 +2952,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, if (status && oi->typep) *oi->typep = status; strbuf_release(&hdrbuf); - return 0; + return status; } int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags) -Peff