Re: [PATCH] sha1_file: remove recursion in packed_object_info

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



thomas <trast@xxxxxxxxxxxxxxx> writes:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> The following comment is also lost but...
>>
>>> -	/* We choose to only get the type of the base object and
>>> -	 * ignore potentially corrupt pack file that expects the delta
>>> -	 * based on a base with a wrong size.  This saves tons of
>>> -	 * inflate() calls.
>>> -	 */
>>> -	if (sizep) {
>>> -		*sizep = get_size_from_delta(p, w_curs, curpos);
>>> -		if (*sizep == 0)
>>> -			type = OBJ_BAD;
>>
>> ... is this check correct?  There is an equivalent check at the
>> beginning of the new packed_object_info() to error out a deltified
>> result.  Why is an object whose size is 0 bad?
>
> Cc'ing Nicolas, but I think there are several reasons:
>
> If it's a delta, then according to docs[1] it starts with the SHA1 of
> the base object, plus the deflated data.  So it is at least 20 bytes.

get_size_from_delta() grabs the size, the number you would get in
the third parameter of read_sha1_file(), of the result of applying
the delta we are looking at.  The part that stores this information
is called the "compressed delta data" in the document you are
looking at.

The function you want to look at is patch_delta(), where it grabs
two such sizes from the delta stream with get_delta_hdr_size().

A delta stream begins with:

    * preimage length, expressed as a 7-bit-per-byte varint;
    * postimage length, expressed as a 7-bit-per-byte varint;

followed by number of records, each prefixed by a command byte.

    * Command byte with its 8th bit set records source offset and
      size (max 32 and 24 bits, respectively---other 7 bits in the
      command byte tells us how large the offset and size are) and
      tells us to insert a copy of that region at the current point.

    * Command byte between 1-127 (inclusive) tells us to add that
      many bytes that follow the command byte from the delta stream
      at the current point.

    * Command byte 0 is an error.

And get_size_from_delta() skips the preimage length, grabs postimage
length and returns the latter.  It is how we decide how many bytes
we need to allocate to hold the result of applying the delta.

> If it's not a delta, then it must start with '<type> <size>\0', which
> even after compression cannot possibly be 0 bytes.
>
> Either way, get_size_from_delta() also uses 0 as the error return.

Yes, that is why I said "is this check correct?".  As I already
said, I think the only two things that protects us from creating a
delta whose postimage size is 0 are the fact that we do not even
attempt to deltify anything smaller than 50 bytes in pack-objects,
and create_delta() refuses to create a delta to produce an empty
postimage.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]