Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> Junio C Hamano <junkio@xxxxxxx> wrote:
>> It might be worthwhile to disable revalidating reused objects
>> individually and instead scan and checksum the entire .pack file
>> when the number of objects being reused exceeds a certain
>> threshold, relative to the number of objects in the existing
>> pack, perhaps.
>
> Correct me if I'm wrong but didn't this revalidate check happen
> because the SHA1 of the pack was correct but there was a bad bit
> in the zlib stream?

There are two more or less independent problems here, and you are
correct to point out that checksumming the entire .pack catches one
class of problem but not the other.

Linus's theory goes like this:

 (1) A bit in an existing pack was damaged somehow.  It might have
     happened on the mothership machine when the pack was first
     created, or after it was read and copied to the notebook via
     rsync.

 (2) A 'repack -a -d' was done in a repository that had that
     damaged pack; it decided to reuse a deflated object from the
     existing, damaged pack, and copied that deflated
     representation into a newly created pack without validating
     it.

 (3) The 'repack -a -d', when finishing the newly created pack,
     computed a checksum over the whole new pack and wrote that
     SHA-1 checksum at its end.

Now, sha1_file.c has check_packed_git_idx(), which makes sure the
.idx file is not corrupted; it runs a checksum over the whole .idx
data when the file is first mapped in, and makes sure the sum
matches what is recorded at its end.  So if the corruption in (1)
had happened to the .idx file, 'repack -a -d' in (2) would have
noticed and refused to use the corresponding .pack.

The problem is that there is no corresponding "checksum over the
whole file before use" for the .pack file during a normal
use_packed_git() operation.  Otherwise we would have caught the
corruption in the existing pack and prevented step (2) from
propagating the error.

The idea proposed by Linus, and implemented in the patch in this
thread, is to mitigate this by revalidating the individual pieces
we copy in (2).  If we copy out most of what is in the existing
pack, however, it may be cheaper to do the "whole file checksum
before use" instead (see the first sketch at the end of this
message).

On the other hand, the "alpha particle at the right moment" theory
goes like this:

 (1) write_object() gave the buffer to sha1write_compressed();

 (2) sha1write_compressed() asked zlib to deflate the data and
     received the result in the buffer pointed to by void *out;

 (3) a bit flipped in that buffer after zlib had finished writing
     into it, trailing checksum and all;

 (4) sha1write() wrote out the now-corrupt buffer, computing a
     checksum that is correct for the end result (which is itself
     corrupt).

This breakage will not be caught unless we revalidate everything we
copy out to the new pack (see the second sketch at the end of this
message).
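
To make the "whole file checksum before use" idea concrete, here is
a minimal, self-contained sketch of the check.  This is not git's
code; verify_pack_trailer() is a name made up for illustration, and
it assumes OpenSSL's SHA-1 routines (link with -lcrypto).  It
recomputes the SHA-1 over everything except the trailing 20 bytes
of the pack and compares the result with that trailer, which is the
checksum 'repack -a -d' writes in step (3) above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

/*
 * Checksum the pack body (everything except the trailing 20-byte
 * SHA-1) and compare the result with that trailer.  Returns 0 on
 * a match, -1 otherwise.
 */
static int verify_pack_trailer(const char *path)
{
	unsigned char buf[8192], trailer[SHA_DIGEST_LENGTH];
	unsigned char sum[SHA_DIGEST_LENGTH];
	long size, body_len, pos = 0;
	SHA_CTX ctx;
	FILE *fp = fopen(path, "rb");

	if (!fp)
		return -1;
	fseek(fp, 0, SEEK_END);
	size = ftell(fp);
	if (size < SHA_DIGEST_LENGTH) {
		fclose(fp);
		return -1;
	}
	body_len = size - SHA_DIGEST_LENGTH;

	/* read the checksum recorded in the trailer */
	fseek(fp, body_len, SEEK_SET);
	if (fread(trailer, 1, SHA_DIGEST_LENGTH, fp) != SHA_DIGEST_LENGTH) {
		fclose(fp);
		return -1;
	}

	/* recompute the checksum over the body */
	fseek(fp, 0, SEEK_SET);
	SHA1_Init(&ctx);
	while (pos < body_len) {
		size_t want = sizeof(buf);
		if (body_len - pos < (long)want)
			want = body_len - pos;
		if (fread(buf, 1, want, fp) != want) {
			fclose(fp);
			return -1;
		}
		SHA1_Update(&ctx, buf, want);
		pos += want;
	}
	SHA1_Final(sum, &ctx);
	fclose(fp);

	return memcmp(sum, trailer, SHA_DIGEST_LENGTH) ? -1 : 0;
}

int main(int argc, char **argv)
{
	if (argc != 2)
		return 1;
	if (verify_pack_trailer(argv[1])) {
		fprintf(stderr, "%s: trailer checksum mismatch\n", argv[1]);
		return 1;
	}
	printf("%s: ok\n", argv[1]);
	return 0;
}

This would catch Linus's scenario, because the flipped bit in the
existing pack moves the body sum away from the recorded trailer.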
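
The alpha-particle scenario is different: there the trailing SHA-1
was computed over the already-corrupt buffer, so the whole-file
check passes.  Per-piece revalidation still catches it, because a
bit flip in a deflated stream almost always makes zlib itself choke
on inflation.  A small demonstration, using plain zlib and nothing
git-specific (the payload and the flipped bit position are
arbitrary):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
	const unsigned char src[] =
		"an object's deflated representation copied into a new pack";
	unsigned char packed[128], unpacked[128];
	uLongf packed_len = sizeof(packed);
	uLongf unpacked_len = sizeof(unpacked);
	int ret;

	/* deflate, the way sha1write_compressed() asks zlib to */
	if (compress(packed, &packed_len, src, sizeof(src)) != Z_OK)
		return 1;

	/* the "alpha particle": flip one bit in the deflated result */
	packed[packed_len / 2] ^= 0x04;

	/*
	 * Inflating the damaged stream should fail: either the
	 * decode breaks down, or the checksum recorded in the
	 * stream no longer matches the output.
	 */
	ret = uncompress(unpacked, &unpacked_len, packed, packed_len);
	printf("uncompress: %s\n",
	       ret == Z_OK ? "ok (corruption slipped through!)" : zError(ret));
	return 0;
}

Recomputing the object's SHA-1 after inflating would be stronger
still, but even inflation alone catches this class of damage.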