On Fri, Jun 24, 2016 at 11:56:19AM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > The ustar format only has room for 11 (or 12, depending on
> > some implementations) octal digits for the size and mtime of
> > each file. After this, we have to add pax extended headers
> > to specify the real data, and git does not yet know how to
> > do so.
>
> I am not a native speaker but "After" above made me hiccup. I think
> I am correct to understand that it means "after passing this limit",
> aka "to represent files bigger or newer than these", but still it
> felt somewhat strange.

Yeah, I agree that it reads badly. I'm not sure what I was thinking.
I'll tweak it in the re-roll.

> > +# See if our system tar can handle a tar file with huge sizes and dates far in
> > +# the future, and that we can actually parse its output.
> > +#
> > +# The reference file was generated by GNU tar, and the magic time and size are
> > +# both octal 01000000000001, which overflows normal ustar fields.
> > +#
> > +# When parsing, we'll pull out only the year from the date; that
> > +# avoids any question of timezones impacting the result.
>
> ... as long as the month-day part is not close to the year boundary.
> So this explanation is insufficient to convince the reader that
> "that avoids any question" is correct, without saying that it is in
> August of year 4147.

I thought that part didn't need to be said, but I can say it
(technically we can include the month, too, but I don't think that level
of accuracy is really important for these tests).

> > +tar_info () {
> > +	"$TAR" tvf "$1" | awk '{print $3 " " $4}' | cut -d- -f1
> > +}
>
> A blank after the shell function to make it easier to see the
> boundary.

I was intentionally trying to couple it with prereq below, as the
comment describes both of them.

> Seeing an awk piped into cut always makes me want to suggest a
> single sed/awk/perl invocation.

I want the auto-splitting of awk, but then to auto-split the result
using a different delimiter. Is there a not-painful way to do that in
awk?

I could certainly come up with a regex to do it in sed, but I wanted to
keep the parsing as liberal and generic as possible.

Certainly I could do it in perl, but I had the general impression that
we prefer to keep the dependency on perl to a minimum. Maybe it doesn't
matter.

> > +# We expect git to die with SIGPIPE here (otherwise we
> > +# would generate the whole 64GB).
> > +test_expect_failure BUNZIP 'generate tar with huge size' '
> > +	{
> > +		git archive HEAD
> > +		echo $? >exit-code
> > +	} | head -c 4096 >huge.tar &&
> > +	echo 141 >expect &&
> > +	test_cmp expect exit-code
> > +'
>
> "head -c" is GNU-ism, isn't it?

You're right; for some reason I thought it was in POSIX. We do have a
couple instances of it, but they are all in the valgrind setup code
(which I guess most people don't ever run).

> "dd bs=1 count=4096" is hopefully more portable.

Hmm. I always wonder whether dd is actually very portable, but we do use
it already, at least.

Perhaps the perl monstrosity in t9300 could be replaced with that, too.

> ksh signal death you already know about. I wonder if we want to
> expose something like list_contains as a friend of test_cmp.
>
> 	list_contains 141,269 $(cat exit-code)

I think we would want something more like:

  test_signal_match 13 $(cat exit-code)

Each call site should not have to know about every signal convention
(and in your example, the magic "3" of Windows is left out).

-Peff
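
For illustration only, the test_signal_match idea above could be
sketched roughly as follows. This is a hypothetical helper, not existing
test-lib code: the body is an assumption, as is the choice to accept the
Windows exit code 3 regardless of which signal was expected. The first
two branches cover the 128+N and 256+N (ksh) conventions that give the
141 and 269 mentioned above for SIGPIPE (13).

```shell
# Hypothetical sketch; usage: test_signal_match <signal-number> <exit-code>
# Returns 0 if <exit-code> looks like death by <signal-number> under any
# of the conventions discussed in this thread, and 1 otherwise.
test_signal_match () {
	if test "$2" = "$((128 + $1))"
	then
		# common shell convention: 128 + signal number (141 for SIGPIPE)
		return 0
	elif test "$2" = "$((256 + $1))"
	then
		# ksh convention: 256 + signal number (269 for SIGPIPE)
		return 0
	elif test "$2" = "3"
	then
		# the magic "3" of Windows; matching it for every signal
		# is an assumption of this sketch
		return 0
	fi
	return 1
}
```

A call site would then only name the signal, e.g.
"test_signal_match 13 $(cat exit-code)", and stay unaware of the
per-platform exit codes.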