> On 29 Aug 2016, at 19:46, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> larsxschneider@xxxxxxxxx writes:
>
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
>> index 7b45136..34c8eb9 100755
>> --- a/t/t0021-conversion.sh
>> +++ b/t/t0021-conversion.sh
>> @@ -4,6 +4,15 @@ test_description='blob conversion via gitattributes'
>>
>> . ./test-lib.sh
>>
>> +if test_have_prereq EXPENSIVE
>> +then
>> +	T0021_LARGE_FILE_SIZE=2048
>> +	T0021_LARGISH_FILE_SIZE=100
>> +else
>> +	T0021_LARGE_FILE_SIZE=30
>> +	T0021_LARGISH_FILE_SIZE=2
>> +fi
>
> Minor: do we need T0021_ prefix?  What are you trying to avoid
> collisions with?

Not necessary. I'll remove the prefix.


>> +	git checkout -- test test.t test.i &&
>> +
>> +	mkdir generated-test-data &&
>> +	for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
>> +	do
>> +		RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
>> +		ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
>
> In earlier iteration of loop with lower $i, what guarantees that
> some bytes survive "tr -dc"?

Nothing really, good catch! The seed "end" always produces an "S" as the
first character, which would survive "tr -dc". However, relying on that is
clunky. I will always set "1" as the first character of $RANDOM_STRING to
mitigate the problem.

>
>> +		# Generate 1MB of empty data and 100 bytes of random characters
>
> 100 bytes?  It seems to me that you are giving 1MB and then $i-byte
> or less (which sometimes can be zero) of random string.

Outdated comment. Will fix!

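Regarding the "tr -dc" point above, here is a minimal sketch of what I
have in mind. The helper name alnum_with_prefix is made up for this mail
and is not in the patch; in t0021 the input would come from
"test-genrandom end $i" rather than from a fixed string:

```shell
# Hypothetical helper (not part of the patch): read bytes on stdin, keep only
# ASCII alphanumerics, and prepend a literal "1" so the result is never empty.
alnum_with_prefix () {
	printf '1%s' "$(LC_ALL=C tr -dc 'A-Za-z0-9')"
}

# A fixed string without any alphanumeric byte stands in for an unlucky
# random draw where nothing survives "tr -dc".
RANDOM_STRING="$(printf '%s' '-+/=' | alnum_with_prefix)"
echo "$RANDOM_STRING"
```

Even when every input byte is filtered out, $RANDOM_STRING still holds the
leading "1".
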
>
>> +		# printf "$(test-genrandom start $i)"
>> +		printf "%1048576d" 1 >>generated-test-data/large.file &&
>> +		printf "$RANDOM_STRING" >>generated-test-data/large.file &&
>> +		printf "%1048576d" 1 >>generated-test-data/large.file.rot13 &&
>> +		printf "$ROT_RANDOM_STRING" >>generated-test-data/large.file.rot13 &&
>> +
>> +		if test $i = $T0021_LARGISH_FILE_SIZE
>> +		then
>> +			cat generated-test-data/large.file >generated-test-data/largish.file &&
>> +			cat generated-test-data/large.file.rot13 >generated-test-data/largish.file.rot13
>> +		fi
>> +	done
>
> This "now we are done with the loop, so copy them to the second
> pair" needs to be in the loop?  Shouldn't it come after 'done'?

No, it does not need to be in the loop. I think I could do this
after the loop instead:

head -c $((1048576*$T0021_LARGISH_FILE_SIZE)) generated-test-data/large.file >generated-test-data/largish.file


> I do not quite get the point of this complexity.  You are using
> exactly the same seed "end" every time, so in the first round you
> have 1M of SP, letter '1', letter 'S' (from the genrandom), then
> in the second round you have 1M of SP, letter '1', letter 'S' and
> letter 'p' (the last two from the genrandom), and go on.  Is it
> significant for the purpose of your test that the cruft inserted
> between the repetition of 1M of SP gets longer by one byte but they
> all share the same prefix (e.g. "1S", "1Sp", "1SpZ", "1SpZT",
> ... are what you insert between a large run of spaces)?

The pktline packets have a constant size. If the cruft between 1M of SP has
a constant size as well then the generated packets for the test data would
repeat themselves. That's why I increased the length after every 1M of SP.

However, I realized that this test complexity is not necessary. I'll
simplify it in the next round.

Thanks,
Lars
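
P.S. A scaled-down sketch of the "head -c" idea above, assuming 1 KiB
blocks instead of 1 MiB so that it runs instantly. The names BLOCK and dir
and the use of mktemp are assumptions for illustration only and not
necessarily what the next round will contain:

```shell
# Hypothetical small-scale stand-in for the test data generation.
BLOCK=1024
LARGE_FILE_SIZE=4
LARGISH_FILE_SIZE=2
dir=$(mktemp -d)

# Append LARGE_FILE_SIZE blocks; each block is BLOCK-1 spaces followed by "1".
i=1
while test $i -le $LARGE_FILE_SIZE
do
	printf "%${BLOCK}d" 1 >>"$dir/large.file"
	i=$((i + 1))
done

# After the loop, derive the smaller file from the first blocks of the large one.
head -c $((BLOCK * LARGISH_FILE_SIZE)) "$dir/large.file" >"$dir/largish.file"
wc -c <"$dir/largish.file"
```

The smaller file is then simply a prefix of the large one, so no copy is
needed inside the loop.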