Re: [PATCH 1/2] archive-tar: write extended headers for file sizes >= 8GB

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 21.06.2016 um 17:59 schrieb Jeff King:
> On Tue, Jun 21, 2016 at 12:54:11AM +0200, René Scharfe wrote:
> 
>>> Unfortunately, it's quite an expensive test to run. For one
>>> thing, unless your filesystem supports files with holes, it
>>> takes 64GB of disk space (you might think piping straight to
>>> `hash-object --stdin` would be better, but it's not; that
>>> tries to buffer all 64GB in RAM!). Furthermore, hashing and
>>> compressing the object takes several minutes of CPU time.
>>>
>>> We could ship just the resulting compressed object data as a
>>> loose object, but even that takes 64MB. So sadly, this code
>>> path remains untested in the test suite.
>>
>> If we could set the limit to a lower value than 8GB for testing then we
>> could at least check if the extended header is written, e.g. if ustar_size()
>> could be convinced to return 0 every time using a hidden command line
>> parameter or an environment variable or something better.
> 
> Yes, we could do that, though I think it loses most of the value of the
> test. We can check that if we hit an arbitrary value we generate the pax
> header, but I think what we _really_ care about is: did we generate an
> output that somebody else's tar implementation can handle.

I agree with the last point, but don't see how that diminishes the
value of such a test.  If we provide file sizes only through extended
headers (the normal header field being set to 0) and we can extract
files with correct sizes then tar must have interpreted those header
as intended, right?

(Or it just guessed the sizes by searching for the next header magic,
but such a fallback won't be accurate for files ending with NUL
characters due to NUL-padding, so we just have to add such a file.)

René


-- >8 --
Subject: archive-tar: test creation of pax extended size headers

---
The value 120 is magic; we need it to pass the tests.  That's
because prepare_header() is used for building extended header
records as well and we don't create extended headers for extended
headers (not sure if that would work anyway), so they simply
vanish when they're over the limit as their size field is set to
zero.

 archive-tar.c       | 7 ++++++-
 t/t5000-tar-tree.sh | 7 +++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/archive-tar.c b/archive-tar.c
index f53e61c..fbbc4cc 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -14,6 +14,7 @@ static char block[BLOCKSIZE];
 static unsigned long offset;
 
 static int tar_umask = 002;
+static unsigned long ustar_size_max = 077777777777UL;
 
 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args);
@@ -179,7 +180,7 @@ static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)
 
 static inline unsigned long ustar_size(uintmax_t size)
 {
-	if (size <= 077777777777UL)
+	if (size <= ustar_size_max)
 		return size;
 	else
 		return 0;
@@ -412,6 +413,10 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 		}
 		return 0;
 	}
+	if (!strcmp(var, "tar.ustarsizemax")) {
+		ustar_size_max = git_config_ulong(var, value);
+		return 0;
+	}
 
 	return tar_filter_config(var, value, cb);
 }
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 4b68bba..03bb4c7 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -102,6 +102,7 @@ test_expect_success \
      echo long filename >a/four$hundred &&
      mkdir a/bin &&
      test-genrandom "frotz" 500000 >a/bin/sh &&
+     printf "\0\0\0" >>a/bin/sh &&
      printf "A\$Format:%s\$O" "$SUBSTFORMAT" >a/substfile1 &&
      printf "A not substituted O" >a/substfile2 &&
      if test_have_prereq SYMLINKS; then
@@ -157,6 +158,12 @@ test_expect_success 'git-archive --prefix=olde-' '
 
 check_tar with_olde-prefix olde-
 
+test_expect_success !TAR_NEEDS_PAX_FALLBACK 'pax extended size headers' '
+	git -c tar.ustarsizemax=120 archive HEAD >extended_size_header.tar
+'
+
+check_tar extended_size_header
+
 test_expect_success 'git archive on large files' '
     test_config core.bigfilethreshold 1 &&
     git archive HEAD >b3.tar &&
-- 
2.9.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]