On Sun, 27 Aug 2006, Linus Torvalds wrote: > Now, "git repack -a" is obviously special in that the "-a" will mean that > we will generally touch all of the old pack _anyway_, and thus verifying > the signature is no longer at all as unreasonable as it is under other > circumstances. And very arguably, if you _also_ do "-d", then since that > is a fairly dangerous operation with the potential for real data loss, you > could well argue that we should do it. We definitely should do it with -d. > However, since the data was _already_ corrupt in that situation, and since > a "git-fsck-objects --full" _will_ pick up the corruption in that case > both before and after, equally arguably it's also true that there's really > not a huge advantage to checking it in "git repack -a -d". There really is an advantage. Given that the absence of -d leaves old objects/packs around, there is a greater chance for still finding an early copy of the bad object. > In other words, in your case, the reason you ended up with the corruption > spreading was _not_ that "git repack -a -d" might have silently not > noticed it, but really the fact that unison meant that the corruption > would spread from one archive to another in the first place. But -d would definitely delete old packs that could have had a non corrupted copy of the desired object. > Final note: a "git repack -a -d" normally actually _does_ do almost as > much checking as a "git-fsck-objects". It's literally just the "copy the > already packed object from an old pack to a new one" that it an > optimization that short-circuits all the normal git sanity checks. All the > other cases will effectively do a lot of integrity checking just by virtue > of unpacking the data in the object that is packed, before re-packing it. > > So it might well be the case that we should simply add an extra integrity > check to the raw data copy in builtin-pack-objects.c: write_object(). > > Now, these days there is actually two cases of that: the pack-to-pack copy > (which has existed for a long while) and the new "!legacy_loose_object()" > case. They should perhaps both verify the integrity of what they copy. I think that git-pack-object should grow another flag: --verify-src or something. That flag would force the verification of the pack checksum for any pack used for the repack operation. Then it should be used anytime -d is provided to git-repack. When -d is not provided then we can skip that --verify-src security measure since nothing gets deleted and therefore no risk of loosing a good object over a corrupted one would happen. Nicolas - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html