On Fri, Mar 31, 2017 at 06:05:17PM +0200, Lars Schneider wrote:

> I just got a report with the following output after a "git fetch" operation
> using Git 2.11.0.windows.3 [1]:
>
> remote: Counting objects: 5922, done.
> remote: Compressing objects: 100% (14/14), done.
> error: inflate: data stream error (unknown compression method)
> error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
> fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
> fatal: index-pack failed
>
> I would be really surprised if we discovered a SHA1 collision in a production
> repo. My guess is that this is somehow triggered by a network issue (see data
> stream error). Any tips how to debug this?

I'd be surprised, too. :)

I'm not sure that inflate error comes from the network pack. The
"unable to unpack $sha1 header" message comes from
sha1_loose_object_info(). Which means we're failing to read our _local_
version of 6acd8f279a8b, which is an object we believe is coming in from
the network.

And that would explain the false-positive collision. We computed the
sha1 on something that came from the network. We believe we have an
object with that sha1 already (but it's corrupted), and then when we
compared the real bytes to the corrupted bytes, they didn't match.

We should be able to confirm that by running "git fsck" on the local
repo, which I'd expect to complain about the loose object.

But what I find disturbing there is that we did not notice the failure
when accessing the loose object. It seems to have been lost in the call
chain. I think the problem is that sha1_loose_object_info() may report
errors in two ways: returning -1 if it did not find the object, or
putting OBJ_BAD into the type field if it found a corrupt object. But
callers are not aware of the second one.

I think it should probably return -1 for a corruption, too, and act as
if we don't have the object (because we effectively don't).

-Peff
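
To make the two error-reporting paths described above concrete, here is
a minimal sketch in C. It is not Git's code: the in-memory "store", the
struct, and the function names loose_info_current(),
loose_info_proposed(), and have_object() are all hypothetical stand-ins,
and the real sha1_loose_object_info() has a different signature. It only
illustrates how a caller that checks just the return value can mistake a
corrupt object for a present one, and how returning -1 on corruption
would avoid that.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; Git's real types differ. */
enum obj_type { OBJ_BAD = -1, OBJ_NONE = 0, OBJ_BLOB = 3 };

struct fake_loose_object {
	const char *hex; /* object name */
	int corrupt;     /* nonzero if the header cannot be unpacked */
};

/* Toy object store: one corrupt entry, one healthy entry. */
static const struct fake_loose_object store[] = {
	{ "6acd8f279a8b20311665f41134579b7380970446", 1 },
	{ "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391", 0 },
};

/*
 * Behavior as described in the message: -1 only when the object is
 * missing; corruption is signalled solely by writing OBJ_BAD to *type.
 */
static int loose_info_current(const char *hex, enum obj_type *type)
{
	size_t i;

	for (i = 0; i < sizeof(store) / sizeof(store[0]); i++) {
		if (strcmp(store[i].hex, hex))
			continue;
		*type = store[i].corrupt ? OBJ_BAD : OBJ_BLOB;
		return 0;
	}
	return -1;
}

/* Suggested behavior: treat a corrupt object as if we do not have it. */
static int loose_info_proposed(const char *hex, enum obj_type *type)
{
	if (loose_info_current(hex, type) < 0 || *type == OBJ_BAD)
		return -1;
	return 0;
}

/* A caller that looks only at the return value, never at *type. */
static int have_object(int (*info)(const char *, enum obj_type *),
		       const char *hex)
{
	enum obj_type type = OBJ_NONE;

	return info(hex, &type) == 0;
}
```

With the first variant, have_object() reports the corrupt 6acd8f27...
entry as present, which is the situation that would lead to comparing
incoming bytes against garbage. With the second variant the same caller
sees the object as missing, without needing to know about OBJ_BAD.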