Re: [PATCH] archive: Store checksum correctly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 23.07.19 um 18:49 schrieb Junio C Hamano:
> Matt Turner <mattst88@xxxxxxxxx> writes:
>
>> tar2sqfs (part of https://github.com/topics/tar2sqfs) rejects tarballs
>> made with git archive with the message
>>
>>     invalid tar header checksum!
>>
>> tar2sqfs recomputes the tarball's checksum to verify it. Its checksum
>> implementation agrees with GNU tar, which contains a comment that states
>>
>>     Fill in the checksum field.  It's formatted differently from the
>>     other fields: it has [6] digits, a null, then a space ...
>>
>> Correcting this allows tar2sqfs to correctly process the tarballs made
>> by git archive.
>>
>> Signed-off-by: Matt Turner <mattst88@xxxxxxxxx>
>> ---
>>  archive-tar.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/archive-tar.c b/archive-tar.c
>> index 3e53aac1e6..f9a157bfd1 100644
>> --- a/archive-tar.c
>> +++ b/archive-tar.c
>> @@ -215,7 +215,9 @@ static void prepare_header(struct archiver_args *args,
>>  	memcpy(header->magic, "ustar", 6);
>>  	memcpy(header->version, "00", 2);
>>
>> -	xsnprintf(header->chksum, sizeof(header->chksum), "%07o", ustar_header_chksum(header));
>> +	xsnprintf(header->chksum, sizeof(header->chksum), "%06o", ustar_header_chksum(header));
>> +	header->chksum[6] = '\0';
>> +	header->chksum[7] = ' ';
>
> Wow.  The choice of %07o is as old as the very original of tar-tree
> implementation in our codebase, starting at ae64bbc1 ("tar-tree:
> Introduce write_entry()", 2006-03-25).

Actually it's already in 731ab9ccf2 ("[PATCH] create tar archives of
tree on the fly", 2005-04-28).

> I think the updated behaviour matches Wikipedia [*1*] where it
> spells out that 6 octal is followed by a NUL and a SP; it also says
> various implementations do not adhere to this format---perhaps they
> meant us ;-)

OpenBSD's pax(1) does the same if I read
https://github.com/openbsd/src/blob/master/bin/pax/tar.c correctly.

> Frustratingly, I couldn't find this spelled out in POSIX [*2*] X-<.
> The closest I found there was
>
>     All other fields are leading zero-filled octal numbers using
>     digits from the ISO/IEC 646:1991 standard IRV. Each numeric
>     field is terminated by one or more <space> or NUL characters.
>
> which sounds as if the standard wanted to be deliberately more
> inclusive than what tar2sqfs is insisting on.

Yes, I remember following that specific paragraph when writing the
code.  The wording probably tries to account for the different
variants invented by vendors over the decades.

is_checksum_valid() in
https://github.com/AgentD/squashfs-tools-ng/blob/master/lib/tar/checksum.c
compares the formatted checksum byte-by-byte.  That seems
unnecessarily strict.  Parsing and comparing the numerical value
of the checksum would be more forgiving, better adhere to POSIX and
might be a tiny bit quicker.

René




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux