Re: [PATCH 3/5] optimize parse_sha1_header() a little by detecting object type

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Shawn O. Pearce wrote:
> Liu Yubao <yubao.liu@xxxxxxxxx> wrote:
>> diff --git a/sha1_file.c b/sha1_file.c
>> index dccc455..79062f0 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -1099,7 +1099,8 @@ static void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
>>  
>>  		if (!fstat(fd, &st)) {
>>  			*size = xsize_t(st.st_size);
>> -			map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
>> +			if (*size > 0)
>> +				map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
>>  		}
>>  		close(fd);
>>  	}
> 
> This has nothing to do with this change description.  Why are we
> returning NULL from map_sha1_file when the file length is 0 bytes?
> No loose object should ever be an empty file, there must always be
> some sort of type header present.  So it probably is an error to
> have a 0 length file here.  But that bug is a different change.
> 

Also a defensive programming for uncompressed loose object because the mapped memory
will be passed to parse_sha1_header() directly without being checked by inflateInit().

In fact unpack_sha1_header() in current code calls legacy_loose_object() without checking
mapsize first. If it encounters (very very unlikely) a corrupted empty loose object, it
will crash.

>> @@ -1257,6 +1258,8 @@ static int parse_sha1_header(const char *hdr, unsigned long length, unsigned lon
>>  	 * terminating '\0' that we add), and is followed by
>>  	 * a space, at least one byte for size, and a '\0'.
>>  	 */
>> +	if ('b' != *hdr && 'c' != *hdr && 't' != *hdr)	/* blob/commit/tag/tree */
>> +		return -1;
>>  	i = 0;
>>  	while (hdr < hdr_end - 2) {
>>  		char c = *hdr++;
> 
> Oh.  I wouldn't do that.  Its a cute trick and it works to quickly
> determine if the header is an uncompressed header vs. a zlib header
> vs. a new-style loose object header (which git cannot write anymore,
> but it still can read).  But its just asking for trouble when/if a
> new object type was ever added to the type table.
> 
I can't agree any more, it's just a trick. I considered adding
a function seems_likely_uncompressed_loose_object(), but I didn't
because this patch series are just my first try, I don't know whether
the idea to support uncompressed loose object is attractive enough.

> Given that we know that no type name can be more than 10 bytes and
> if you use my patch from earlier today you can be certain hdr has a
> '\0' terminator, so you could write a function to test for the type
> against the hdr, stopping on either ' ' or '\0'.  Or find the first
> ' ' in the first 10 bytes (which is what this loop does anyway) and
> then test that against the type name table.
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux