Re: [PATCH 2/2] diff tests: rewrite flakyness-causing test "aid"

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 14 Apr 2021 08:22:43 +0200

On Tue, Apr 13 2021, Matheus Tavares Bernardino wrote:

> On Tue, Apr 13, 2021 at 4:01 PM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>>
>>
>> On Tue, Apr 13 2021, Matheus Tavares Bernardino wrote:
>>
>> > On Tue, Apr 13, 2021 at 9:31 AM Ævar Arnfjörð Bjarmason
>> > <avarab@xxxxxxxxx> wrote:
>> >>  t/t4013-diff-various.sh | 25 +++++++++++++++----------
>> >>  1 file changed, 15 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
>> >> index 67f6411aff9..228ff100c61 100755
>> >> --- a/t/t4013-diff-various.sh
>> >> +++ b/t/t4013-diff-various.sh
>> >> @@ -200,10 +200,12 @@ do
>> >>         esac
>> >>         test=$(echo "$label" | sed -e 's|[/ ][/ ]*|_|g')
>> >>         pfx=$(printf "%04d" $test_count)
>> >> -       expect="$TEST_DIRECTORY/t4013/diff.$test"
>> >> +       expect_relative="t4013/diff.$test"
>> >> +       expect="$TEST_DIRECTORY/$expect_relative"
>> >>         actual="$pfx-diff.$test"
>> >>
>> >>         test_expect_$status "git $cmd # magic is ${magic:-(not used)}" '
>> >> +               test_when_finished "rm $actual" &&
>> >
>> > Nit: before these two patches, "$actual" was only removed when the
>> > test succeeded. So, in case of failure, the failed output files would
>> > still be there for debugging. It might be interesting to keep this
>> > behavior and only remove "$actual" at the end of the test.
>>
>> Either I'm missing something or you are, that's how test_when_finished
>> works.
>>
>> It's skipped under e.g. "--immediate --debug". See b586744a864 (test:
>> skip clean-up when running under --immediate mode, 2011-06-27)
>
> I was mostly thinking about the `artifacts` zip we get from our CI
> when a test fails. I find the final trash dir quite useful for some
> post-mortem analysis, especially to debug WIP tests that only fail
> occasionally or test failures on OSes I don't have quick access to.

Ah, yes that's a problem we should solve, but I think we should not put
off migration to test_when_finished because of that.

The whole reason we use it is to clean up the work area for the next
test.

Thus if we do:

    git something >expected &&
    test_cmp expected actual &&
    rm expected actual

And "git something" segfaults it's only dumb luck that subsequent tests
don't fail in non-obvious ways due to the files being left behind.

If we could rely on this in general we could make everything under
test_when_finished a noop, but I think you'll find that'll fail a lot of
things in the test suite if we do that.

But the problem you cite is legitimate, but we should solve it with
something like:

 1. If we fail tests 58 and 67 out of 100 we should/could run these
    again at the end under some more exhaustive debugging mode where
    we'd save away all the intermediate files.

 2. Or, and test_when_finished helps here, we could set $PATH when it
    runs to a directory with a custom "rm", which would save away the
    temporary files, expecting that we'll tar them up if any of the
    tests whose files we saved failed (we'd delete the other ones).

 3. Similarly, wrap "rm" in some "cat and rm" for common cases like
    actual/expected files, and improve the verbose output to some
    "verbose if fails".