Re: SHA1 collision in production repo?! (probably not)

On Fri, Mar 31, 2017 at 06:05:17PM +0200, Lars Schneider wrote:

> I just got a report with the following output after a "git fetch" operation
> using Git 2.11.0.windows.3 [1]:
> 
> remote: Counting objects: 5922, done.
> remote: Compressing objects: 100% (14/14), done.
> error: inflate: data stream error (unknown compression method)
> error: unable to unpack 6acd8f279a8b20311665f41134579b7380970446 header
> fatal: SHA1 COLLISION FOUND WITH 6acd8f279a8b20311665f41134579b7380970446 !
> fatal: index-pack failed
> 
> I would be really surprised if we discovered a SHA1 collision in a production
> repo. My guess is that this is somehow triggered by a network issue (see data
> stream error). Any tips how to debug this?

I'd be surprised, too. :)

I'm not sure that inflate error actually comes from the network pack.
The "unable to unpack $sha1 header" message actually comes from
sha1_loose_object_info(). Which means we're failing to read our _local_
version of 6acd8f279a8b, which is an object we believe is coming in from
the network.

And that would explain the false-positive collision. We computed the
sha1 on something coming from the network. We believe we have an object
with that sha1 already (but it's corrupted), and then when we compared
the real bytes to the corrupted bytes, they didn't match.
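
To make that concrete, here is a rough sketch of how an unreadable local
copy can masquerade as a collision. This is not Git's code: struct
local_copy, naive_check() and careful_check() are hypothetical names
invented for illustration, and the "readable" flag simply stands in for
whatever the real read path would report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the local copy of an object (not a Git type). */
struct local_copy {
    const unsigned char *buf; /* bytes we managed to read, possibly garbage */
    size_t len;
    int readable;             /* 0 simulates a corrupt loose object */
};

enum check_result { CHECK_OK, CHECK_COLLISION, CHECK_LOCAL_CORRUPT };

/* Naive check: never looks at the read status, so a corrupt local copy
 * is indistinguishable from "same sha1, different bytes". */
enum check_result naive_check(const struct local_copy *local,
                              const unsigned char *incoming, size_t len)
{
    assert(local != NULL && incoming != NULL);
    if (local->len != len || memcmp(local->buf, incoming, len) != 0)
        return CHECK_COLLISION;
    return CHECK_OK;
}

/* Careful check: an unreadable local copy is reported as corruption
 * before any byte comparison takes place. */
enum check_result careful_check(const struct local_copy *local,
                                const unsigned char *incoming, size_t len)
{
    assert(local != NULL && incoming != NULL);
    if (!local->readable)
        return CHECK_LOCAL_CORRUPT;
    return naive_check(local, incoming, len);
}
```

With a corrupt local copy, naive_check() reports a collision while
careful_check() reports the corruption, which is the distinction the
error output above fails to make.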

We should be able to confirm that by running "git fsck" on the local
repo, which I'd expect to complain about the loose object.

But what I find disturbing there is that we did not notice the failure
when accessing the loose object. It seems to have been lost in the call
chain. I think the problem is that sha1_loose_object_info() may report
errors in two ways: returning -1 if it did not find the object, or
putting OBJ_BAD into the type field if it found a corrupt object. But
callers are not aware of the second one.

I think it should probably return -1 for a corruption, too, and act as
if we don't have the object (because we effectively don't).
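
Something like the following is what I have in mind, as a toy model
rather than a patch. The function names, enum store_state, and the
caller are all made up for illustration; only the idea of the two error
channels (a -1 return versus OBJ_BAD in the type field) comes from the
description above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative object types; only OBJ_BAD matters for this sketch. */
enum obj_type { OBJ_BAD = -1, OBJ_NONE = 0, OBJ_BLOB = 3 };

/* Hypothetical state of the object in the local store. */
enum store_state { STORE_MISSING, STORE_CORRUPT, STORE_OK };

/* Two error channels: -1 when the object is missing, but 0 with
 * *type set to OBJ_BAD when the object is found and corrupt. */
int lookup_two_channel(enum store_state state, enum obj_type *type)
{
    assert(type != NULL);
    if (state == STORE_MISSING)
        return -1;
    *type = (state == STORE_CORRUPT) ? OBJ_BAD : OBJ_BLOB;
    return 0;
}

/* Proposed behavior: fold corruption into the -1 return, so it looks
 * the same as not having the object at all. */
int lookup_single_channel(enum store_state state, enum obj_type *type)
{
    if (lookup_two_channel(state, type) < 0)
        return -1;
    return (*type == OBJ_BAD) ? -1 : 0;
}

/* A caller that only checks the return value, as many callers do. */
int caller_has_object(int (*lookup)(enum store_state, enum obj_type *),
                      enum store_state state)
{
    enum obj_type type = OBJ_NONE;
    return lookup(state, &type) == 0;
}
```

A caller that checks only the return value thinks it has the object
under the two-channel version when the object is corrupt; under the
single-channel version it correctly concludes that it does not.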

-Peff


