Re: [PATCH 3/9] ci: make it easier to find failed tests' logs in the GitHub workflow

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 26 Jan 2022 00:48:03 +0100

On Mon, Jan 24 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@xxxxxx>
>
> You currently have to know a lot of implementation details when
> investigating test failures in the CI runs. The first step is easy: the
> failed job is marked quite clearly, but when opening it, the failed step
> is expanded, which in our case is the one running
> `ci/run-build-and-tests.sh`. This step, most notably, only offers a
> high-level view of what went wrong: it prints the output of `prove`
> which merely tells the reader which test script failed.
>
> The actually interesting part is in the detailed log of said failed
> test script. But that log is shown in the CI run's step that runs
> `ci/print-test-failures.sh`. And that step is _not_ expanded in the web
> UI by default.
>
> Let's help the reader by showing the failed tests' detailed logs in the
> step that is expanded automatically, i.e. directly after the test suite
> failed.
>
> This also helps the situation where the _build_ failed and the
> `print-test-failures` step was executed under the assumption that the
> _test suite_ failed, and consequently failed to find any failed tests.
>
> An alternative way to implement this patch would be to source
> `ci/print-test-failures.sh` in the `handle_test_failures` function to
> show these logs. However, over the course of the next few commits, we
> want to introduce some grouping which would be harder to achieve that
> way (for example, we do want a leaner, and colored, preamble for each
> failed test script, and it would be trickier to accommodate the lack of
> nested groupings in GitHub workflows' output).

Is it really better to have the first thing you see in a failing job be
this level of detail?

To take the "before" demo job from your CL, if you click on a failing
job you'll currently end up with ~1600 lines of "prove" setup and
output, culminating in (the browser auto-scrolls to the end):

    [...]
    Test Summary Report
    -------------------
    t1092-sparse-checkout-compatibility.sh           (Wstat: 256 Tests: 53 Failed: 1)
      Failed test:  49
      Non-zero exit status: 1
    t3701-add-interactive.sh                         (Wstat: 0 Tests: 71 Failed: 0)
      TODO passed:   45, 47
    Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr  1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU)
    Result: FAIL

Is it ideal? No. But I think that folding the ci/print-test-failures.sh
output into that step makes it much worse. Now I'll be sent to the very
bottom of ~16000 lines (yes, that's an extra zero at the end) of output,
ending in:

    [...]
    + test_cmp expect sparse-checkout-out
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u expect sparse-checkout-out
    + test_cmp full-checkout-err sparse-checkout-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-checkout-err
    + test_cmp full-checkout-err sparse-index-err
    + test 2 -ne 2
    + GIT_ALLOC_LIMIT=0 eval diff -u "$@"
    + diff -u full-checkout-err sparse-index-err

    ok 53 - checkout behaves oddly with df-conflict-2
    # failed 1 among 53 test(s)
    1..53

Now you'll need to scroll up or search just to see what test failed.

Usually when these fail I might only look at the failing test name (at
that point already knowing why it failed). I think it's a feature that
we only expand the verbose output later.

I realize that:

1) This isn't the exact output you emit in the post-image here, since you're not
actually using ci/print-test-failures.sh, but from eyeballing the script
it seems to do basically the same thing, i.e. it'll emit the full *.out
file.

B.t.w. why isn't this using ci/print-test-failures.sh. Your "an
alternative way" paragraph doesn't really explain it. Sure, it'll be
further tweaked later, but in the meantime do we have to re-invent
ci/print-test-failures.sh? Anyway...

2) The end-state at the end of this series looks somewhat different, but I think
that end-state shares the UX problem noted above, and to some extent
makes it worse.

That one has 28 thousand lines of output!

Now I know it's elided so you're only supposed to see a few screenfulls
of it, but at least in my browser that end-state is *very slow*, much
slower than the version that shows me the full ~16 thousand lines at
once.

Presumably it's doing some very expensive JavaScript/CSS behind the
scenes.

I mean so slow that when I press page up and down I can see 3-8 lines of
that folded output appear at once, then the next 3-8 lines etc. The
current output meanwhile (and this more verbose one) is
near-instant. This is in Firefox 91.4, if it matters.