Hi Peff,

On Wed, 5 Sep 2018, Jeff King wrote:

> On Wed, Sep 05, 2018 at 02:38:34PM -0400, Eric Sunshine wrote:
>
> > On Wed, Sep 5, 2018 at 8:39 AM Johannes Schindelin
> > <Johannes.Schindelin@xxxxxx> wrote:
> > > So let's hear some ideas on how to improve the situation, m'kay? Just
> > > as a reminder, this is the problem I want to solve: I want to run
> > > the tests in a lightweight manner, with minimal output, and only in
> > > case of an error do I want to crank up the verbosity. Instead of
> > > wasting most of the effort to log everything and then throwing it
> > > away in most of the common cases, I suggest re-running the entire
> > > test.
> >
> > What about the very different approach of capturing the full "verbose"
> > output of the executed tests in addition to whatever is actually output
> > to the terminal? If a test fails, then (and only then) you can insert
> > the captured verbose output into the JUnit XML file. This way (if we
> > always have the full verbose output at hand), you don't need to re-run
> > the test at all.
> >
> > I've cc:'d Peff and Jonathan since I believe they are more familiar
> > with how all the capturing / output-redirection works in the test
> > suite.
>
> I don't think there's much to know beyond what you wrote. The
> "--verbose" case does not really cost any more than the non-verbose one,
> because the only difference is whether output goes to a file versus
> /dev/null. But the commands all call write() regardless.

This really only holds true on systems where writing the redirected
output is essentially a no-op. For the MSYS2-emulated case, it is not,
and as that case is already suffering, time-wise, I would be loath to
pile more onto it.

> For --verbose-log, it does of course cost a little extra to run one
> `tee` per script, and to write a few kb of logs. I doubt those are
> measurable versus the rest of a script run, but I'm happy to be
> disproven by numbers.

I tried, and failed, to quantify this properly, essentially because a
single test run takes over 1h15m. I tried to do it in the cloud
instead, but those timings are too variable for anything approaching an
accurate measurement.

> There are some gymnastics done to re-exec the test script with the same
> shell, but AFAIK those are robust and don't cost a lot (again, one extra
> process per script run).

It *is* actually a bit worse here (but only in the case of failures,
which should be the minority of cases): imagine an expensive script
like t0027 that takes something like 14 minutes to run, with a test
failure near the end; re-running it with verbose output will cost
another 14 minutes.

However, the benefit of allowing flaky tests to pass on the re-run
outweighs that cost here, if you ask me.

> I'm not overly concerned about the cost of re-running a test, since the
> idea is that failed tests should be rare. I would be a little worried
> about flaky tests mysteriously righting themselves on the second run (so
> you know a failure happened, but you have no good output to describe
> it).

This is actually a benefit. Pretty much every time I can remember a
test being flaky (with the one notable exception of the O_APPEND issue
with GIT_TRACE), it was the *test* that was buggy. Plus, a test that
fails only occasionally is less concerning than one that fails all the
time.

My plan is to eventually indicate an outcome of "Passed upon rerun" in
the test output, as soon as there is server-side support for it.
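Just to give an idea of the overall shape of the driver I have in mind
(this is only a sketch, not the actual patch; --quiet, --verbose-log
and -x are the existing test-lib.sh options, everything else is made
up):

	# Sketch: run each test script with minimal output, and only
	# when it fails re-run it with full verbosity captured to a
	# log, so that the log can be embedded into the JUnit XML.
	for t in t[0-9]*.sh
	do
		if ! ./"$t" --quiet
		then
			if ./"$t" --verbose-log -x
			then
				echo "$t: passed upon rerun"
			else
				echo "$t: still failing"
			fi
		fi
	done
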
(Sidenote: there is technically already server-side support for it, but
we would have to generate Visual Studio-style test output, and I had a
hunch that some die-hard Linuxers among the core Git developers would
object to that.)

> I do agree that a test_atexit() cleanup would make a lot more sense than
> what's in the original patch. And that's nicer than the exit trap we're
> using already, because you may note that each caller has to manually
> restore the original 'die' handler that test-lib.sh installs.

Okay, then. I will work on this; a rough sketch of the shape I have in
mind is at the bottom of this mail.

Ciao,
Dscho

> That would also help with bitrot. If this funky cleanup case only causes
> problems with junit output, then other people are likely to forget to do
> it, and the mess falls onto the junit folks (i.e., Dscho). But if the
> tests start _relying_ on test_atexit() being called (i.e., don't call
> stop_git_daemon manually, but assume that the atexit handler does so),
> then the responsible authors are a lot more likely to notice and fix it
> early.
>
> -Peff
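
To make that a bit more concrete, here is the rough shape I have in
mind for test_atexit(), modeled on test_when_finished(); it is only a
sketch, so the names and details may well end up different:

	# tentative test-lib.sh helpers; all names are provisional
	test_atexit_cleanup=:

	test_atexit () {
		# prepend, so that the most recently registered cleanup
		# runs first, just like test_when_finished() does
		test_atexit_cleanup="{ $*
			}; $test_atexit_cleanup"
	}

	# to be called from test_done and from the 'die' handler, so
	# that the cleanup runs exactly once, no matter how the test
	# script exits
	run_test_atexit_cleanup () {
		eval "$test_atexit_cleanup"
		test_atexit_cleanup=:
	}

A library like lib-git-daemon.sh would then only need to call

	test_atexit 'stop_git_daemon'

and could stop fiddling with the 'die' handler altogether.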