Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> Junio C Hamano <junkio@xxxxxxx> wrote:
>> It might be worthwhile to disable revalidating reused objects
>> individually and instead scan and checksum the entire .pack file
>> when the number of objects being reused exceeds a certain
>> threshold, relative to the number of objects in the existing
>> pack, perhaps.
>
> Correct me if I'm wrong but didn't this revalidate check happen
> because the SHA1 of the pack was correct but there was a bad bit
> in the zlib stream?

There are two more or less independent problems here, and you are
correct to point out that checksumming the entire .pack catches one
class of problem but not the other.

Linus's theory goes like this:

 (1) A bit in an existing pack was damaged somehow.  It might have
     happened on the mothership machine when the pack was first
     created, or after it was read and copied to the notebook via
     rsync.

 (2) A 'repack -a -d' was done in a repository that had that
     damaged pack; it decided to reuse a deflated object from the
     existing, damaged pack, and copied that deflated
     representation into a newly created pack without validating
     it.

 (3) The 'repack -a -d', when finishing the newly created pack,
     computed a checksum over the whole new pack and wrote that
     SHA-1 checksum at its end.

Now, sha1_file.c has check_packed_git_idx(), which makes sure the
.idx file is not corrupted; it runs a checksum over the whole .idx
data when the file is first mapped in, and makes sure the sum
matches what is recorded at its end.  So if the corruption in (1)
had happened to the .idx file, 'repack -a -d' in (2) would have
noticed and refused to use the corresponding .pack.

The problem is that there is no corresponding "checksum over the
whole file before use" for the .pack file during a normal
use_packed_git() operation.  Otherwise we would have caught the
corruption in the existing pack and prevented step (2) from
propagating the error.

The idea proposed by Linus, and implemented in the patch in this
thread, is to mitigate this by revalidating the individual pieces
we copy in (2).  If we copy out most of what is in the existing
pack, however, it may be cheaper to do the "whole file checksum
before use" instead (see the first sketch at the end of this
message).

On the other hand, the "alpha particle at the right moment" theory
goes like this:

 (1) write_object() gave the buffer to sha1write_compressed();

 (2) sha1write_compressed() asked zlib to deflate the data and
     received the result in the buffer pointed to by void *out;

 (3) a bit flipped in that buffer after zlib had finished writing
     into it, trailing checksum and all;

 (4) sha1write() wrote out the now-corrupt buffer, computing a
     checksum that is correct for the end result (which is itself
     corrupt).

This breakage will not be caught unless we revalidate everything we
copy out to the new pack (see the second sketch at the end of this
message).
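
To make the "whole file checksum before use" idea concrete, here is
a minimal, self-contained sketch of the check.  This is not git's
code; verify_pack_trailer() is a name made up for illustration, and
it assumes OpenSSL's SHA-1 routines (link with -lcrypto).  It
recomputes the SHA-1 over everything except the trailing 20 bytes
of the pack and compares the result with that trailer, which is the
checksum 'repack -a -d' writes in step (3) above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/sha.h>

/*
 * Checksum the pack body (everything except the trailing 20-byte
 * SHA-1) and compare the result with that trailer.  Returns 0 on
 * a match, -1 otherwise.
 */
static int verify_pack_trailer(const char *path)
{
	unsigned char buf[8192], trailer[SHA_DIGEST_LENGTH];
	unsigned char sum[SHA_DIGEST_LENGTH];
	long size, body_len, pos = 0;
	SHA_CTX ctx;
	FILE *fp = fopen(path, "rb");

	if (!fp)
		return -1;
	fseek(fp, 0, SEEK_END);
	size = ftell(fp);
	if (size < SHA_DIGEST_LENGTH) {
		fclose(fp);
		return -1;
	}
	body_len = size - SHA_DIGEST_LENGTH;

	/* read the checksum recorded in the trailer */
	fseek(fp, body_len, SEEK_SET);
	if (fread(trailer, 1, SHA_DIGEST_LENGTH, fp) != SHA_DIGEST_LENGTH) {
		fclose(fp);
		return -1;
	}

	/* recompute the checksum over the body */
	fseek(fp, 0, SEEK_SET);
	SHA1_Init(&ctx);
	while (pos < body_len) {
		size_t want = sizeof(buf);
		if (body_len - pos < (long)want)
			want = body_len - pos;
		if (fread(buf, 1, want, fp) != want) {
			fclose(fp);
			return -1;
		}
		SHA1_Update(&ctx, buf, want);
		pos += want;
	}
	SHA1_Final(sum, &ctx);
	fclose(fp);

	return memcmp(sum, trailer, SHA_DIGEST_LENGTH) ? -1 : 0;
}

int main(int argc, char **argv)
{
	if (argc != 2)
		return 1;
	if (verify_pack_trailer(argv[1])) {
		fprintf(stderr, "%s: trailer checksum mismatch\n", argv[1]);
		return 1;
	}
	printf("%s: ok\n", argv[1]);
	return 0;
}

This would catch Linus's scenario, because the flipped bit in the
existing pack moves the body sum away from the recorded trailer.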
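
The alpha-particle scenario is different: there the trailing SHA-1
was computed over the already-corrupt buffer, so the whole-file
check passes.  Per-piece revalidation still catches it, because a
bit flip in a deflated stream almost always makes zlib itself choke
on inflation.  A small demonstration, using plain zlib and nothing
git-specific (the payload and the flipped bit position are
arbitrary):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
	const unsigned char src[] =
		"an object's deflated representation copied into a new pack";
	unsigned char packed[128], unpacked[128];
	uLongf packed_len = sizeof(packed);
	uLongf unpacked_len = sizeof(unpacked);
	int ret;

	/* deflate, the way sha1write_compressed() asks zlib to */
	if (compress(packed, &packed_len, src, sizeof(src)) != Z_OK)
		return 1;

	/* the "alpha particle": flip one bit in the deflated result */
	packed[packed_len / 2] ^= 0x04;

	/*
	 * Inflating the damaged stream should fail: either the
	 * decode breaks down, or the checksum recorded in the
	 * stream no longer matches the output.
	 */
	ret = uncompress(unpacked, &unpacked_len, packed, packed_len);
	printf("uncompress: %s\n",
	       ret == Z_OK ? "ok (corruption slipped through!)" : zError(ret));
	return 0;
}

Recomputing the object's SHA-1 after inflating would be stronger
still, but even inflation alone catches this class of damage.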