Re: My git repo is broken, how to fix it ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Tue, 20 Mar 2007, Alexander Litvinov wrote:
> 
> Actualy, I have packed that objects already, so fsck warn me:
> $ git fsck --full
> error: packed 8edc906985f00cf27180b1d9d4c3217ffd1896f8 from .git/objects/pack/pack-abc5cbabfc05c213e50c43ea07f43158bf1de236.pack is corrupt
> error: packed f6aca57bb30a12e9ac5d71558e0b6052d6fb67a8 from .git/objects/pack/pack-abc5cbabfc05c213e50c43ea07f43158bf1de236.pack is corrupt
> sha1 mismatch 8edc906985f00cf27180b1d9d4c3217ffd1896f8
> sha1 mismatch f6aca57bb30a12e9ac5d71558e0b6052d6fb67a8

Ok, this is different from what I expected. 

Since your pack-file seems to pass its own internal SHA1 checks, it means 
that it was likely corrupt already when it was written out in the pack. 
What's interesting is that it seems to unpack, but then the SHA1 of the 
unpacked object doesn't match.

The reason I say that's interesting is that it would seem to mean that the 
zlib CRC/adler check didn't trigger - which probably means that the object 
was corrupted *before* it was compressed (but after it was originally 
SHA1-summed), or the compression itself was corrupting (eg a libz 
problem).

And since the SHA1 of the pack-file matches, the thing was apparently 
also written out "correctly" after compression (but by that "correctly" I 
obviously mean that the *corrupted* data was written out). 

Sadly, by the time it's in a pack-file, it is *really* hard to figure out 
what went wrong: I see your unpacked data, but it's really the packed raw 
objects that I wanted to look at, in case there would be some pattern in 
the actual corruption (the corruption will then result in random crud when 
actually unpacking, which is why the unpacked data isn't that interesting, 
simple because there's no pattern left to analyze - it got inflated to 
bogus "data").

> I also use autocrlf feature:
> $ git config core.autocrlf
> true

I doubt autocrlf affects anything here, it's only used at checkin and 
checkout time, and it wouldn't affect the raw internal git objects.

More interesting might be if you might be using any of the other flags 
that actually affect internal git object packing: "use_legacy_headers" in 
particular? If we have a bug there, that could be nasty.

But to really look at this we should probably add a "really_careful" flag 
that actually re-verifies the SHA1 on read so that we'd catch these kinds 
of corruptions early. 

> This files are cpp code from our project and tham need to be private. Really.

Ok, no problem. I added back the git list (but not your attachments, 
obviously) but as explained above, there is not a lot I can do with the 
unpacked data, I'd like to see the actual "raw" stuff.

I'm hoping somebody has any ideas. We really *could* check the SHA1 on 
each read (and slow down git a lot) and that would catch corruption much 
faster and hopefully pinpoint it more quickly where exactly it happens. 
But maybe somebody has some other smart idea?

		Linus
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]