Am 23.07.19 um 18:49 schrieb Junio C Hamano: > Matt Turner <mattst88@xxxxxxxxx> writes: > >> tar2sqfs (part of https://github.com/topics/tar2sqfs) rejects tarballs >> made with git archive with the message >> >> invalid tar header checksum! >> >> tar2sqfs recomputes the tarball's checksum to verify it. Its checksum >> implementation agrees with GNU tar, which contains a comment that states >> >> Fill in the checksum field. It's formatted differently from the >> other fields: it has [6] digits, a null, then a space ... >> >> Correcting this allows tar2sqfs to correctly process the tarballs made >> by git archive. >> >> Signed-off-by: Matt Turner <mattst88@xxxxxxxxx> >> --- >> archive-tar.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/archive-tar.c b/archive-tar.c >> index 3e53aac1e6..f9a157bfd1 100644 >> --- a/archive-tar.c >> +++ b/archive-tar.c >> @@ -215,7 +215,9 @@ static void prepare_header(struct archiver_args *args, >> memcpy(header->magic, "ustar", 6); >> memcpy(header->version, "00", 2); >> >> - xsnprintf(header->chksum, sizeof(header->chksum), "%07o", ustar_header_chksum(header)); >> + xsnprintf(header->chksum, sizeof(header->chksum), "%06o", ustar_header_chksum(header)); >> + header->chksum[6] = '\0'; >> + header->chksum[7] = ' '; > > Wow. The choice of %07o is as old as the very original of tar-tree > implementation in our codebase, starting at ae64bbc1 ("tar-tree: > Introduce write_entry()", 2006-03-25). Actually it's already in 731ab9ccf2 ("[PATCH] create tar archives of tree on the fly", 2005-04-28). > I think the updated behaviour matches Wikipedia [*1*] where it > spells out that 6 octal is followed by a NUL and a SP; it also says > various implementations do not adhere to this format---perhaps they > meant us ;-) OpenBSD's pax(1) does the same if I read https://github.com/openbsd/src/blob/master/bin/pax/tar.c correctly. > Frustratingly, I couldn't find this spelled out in POSIX [*2*] X-<. > The closest I found there was > > All other fields are leading zero-filled octal numbers using > digits from the ISO/IEC 646:1991 standard IRV. Each numeric > field is terminated by one or more <space> or NUL characters. > > which sounds as if the standard wanted to be deliberately more > inclusive than what tar2sqfs is insisting on. Yes, I remember following that specific paragraph when writing the code. The wording probably tries to account for the different variants invented by vendors over the decades. is_checksum_valid() in https://github.com/AgentD/squashfs-tools-ng/blob/master/lib/tar/checksum.c compares the formatted checksum byte-by-byte. That seems unnecessarily strict. Parsing and comparing the numerical value of the checksum would be more forgiving, better adhere to POSIX and might be a tiny bit quicker. René