Re: [PATCH] test-lib: add ability to cap the runtime of tests

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Sun, 4 Jun 2017 09:29:22 +0200

On Sun, Jun 4, 2017 at 2:31 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:
>
>> Speeding up the test suite by simply cataloging and skipping tests
>> that take longer than N seconds is a hassle to maintain, and entirely
>> skips some tests which would be nice to at least partially run,
>> e.g. instead of entirely skipping t3404-rebase-interactive.sh we can
>> run it for N seconds and get at least some "git rebase -i" test
>> coverage in a fast test run.
>
> I'd be more supportive to the former approach in the longer run for
> two reasons.
>
> Is it even safe to stop a test in the middle?  Won't we leave
> leftover server processes, for example?
>
>     I see start_httpd at least sets up "trap" to call stop_httpd
>     when the shell exits, so HTTP testing via lib-httpd.sh may be
>     safe.  I do not know about other network-y tests, though.

When this flag is in effect and you run into the timeout, the code is
semantically equivalent to not running subsequent test_expect_*
blocks. Things like the trap in lib-httpd.sh will still run, and so
will test_when_finished.

Unless we have some test killing a daemon in a test_expect_success
block later in the test, this'll work as intended.
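
To illustrate why the cleanup still happens, here is a minimal sketch.
It is not the code from the patch: run_block and timed_out are
hypothetical names, and the subshell stands in for one test script. It
only shows that an EXIT trap fires even when the script stops early
and the remaining blocks never run.

```shell
#!/bin/sh
# Sketch only: run_block and timed_out are hypothetical names,
# not the real test-lib.sh implementation.

run_block () {
	if test -n "$timed_out"
	then
		# Stop here, the way an early test_done would.
		echo "# budget exceeded, skipping the rest"
		exit 0
	fi
	echo "run: $1"
}

# The subshell stands in for one test script.
output=$(
	timed_out=
	trap 'echo "cleanup: stop daemon"' EXIT
	run_block "start daemon"
	timed_out=t              # pretend the timeout was hit here
	run_block "slow test"    # exits before running its body
	run_block "later test"   # never reached
)
printf '%s\n' "$output"
```

The cleanup line is printed last, after the skip notice. A daemon that
is only stopped inside a later block (here, "later test") would be
left running, which is the caveat above.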

> Granted, when a test fails, we already have the same problem, but
> then we'd go in and investigate, and the first thing we notice would
> be that the old leftover server instance is holding onto the port to
> prevent the attempt to re-run the test from running, which then we'd
> kill.  But with this option, the user is not even made aware of
> tests being killed in the middle.
>
>> While running with a timeout of 10 seconds cuts the runtime in half,
>> over 92% of the tests are still run. The test coverage is higher than
>> that number indicates, just taking into account the many similar tests
>> t0027-auto-crlf.sh runs brings it up to 95%.
>
> I certainly understand that but in the longer term, I'd prefer the
> approach to call out an overly large test.  That will hopefully
> motivate us to split it (or speed up the thing) to help folks on
> many-core machines.

The reason I didn't document this in t/README was because I thought it
made sense to have this as a mostly hidden feature that end users
wouldn't be tempted to fiddle with, but would be useful to someone
doing git development.

Realistically, I'm going to submit this patch; I'm not going to take
on the much bigger project of refactoring the entire test suite so
that no test runs for more than N seconds, and of course any such
refactoring can only aim for a fixed N instead of a dynamic one.

The point of this change is that I can replace running e.g. "prove
t[0-9]*{grep,log}*.sh" with just running the full test suite every
time, since 30s is noticeably slow during regular hacking but once
it's down to 15s it's perceptibly fast enough.

Reading between the lines in your reply, I think you're afraid that
regular users just testing git out will start using this, as opposed
to power user developers who understand the trade-offs. I think that's
mostly mitigated by not documenting it in t/README, but I could amend
the patch to add some scary comment to test-lib.sh as well.

> I am afraid that the proposed change will disincentivize that by
> sweeping the problematic ones under the rug.  Perhaps you can
> collect what tests are terminated in the middle because they run for
> too long and show the list of them at the end, or something?

This change incentivizes me to regularly run a larger % of the full
test suite.

Collecting the skipped ones is easy enough to do with a grep + for
loop, so I don't think it's worth making the implementation more
complex to occasionally answer the question of how many tests were
skipped due to running into the timeout:

$ rm .prove; for t in 20 10 5 1; do printf "%s\t" $t && \
    (time GIT_TEST_TIMEOUT=$t prove -j$(parallel --number-of-cores) \
    --state=slow,save -v t[0-9]*.sh) 2>&1 | \
    grep -c "Exceeded GIT_TEST_TIMEOUT"; done
rm: cannot remove ‘.prove’: No such file or directory
20      4
10      36
5       80
1       509

Of course that gives you "how many test scripts had skipped tests",
not how many test_expect_* blocks were skipped. An earlier WIP version
of this did the latter, but e.g. running the rest of
t0027-auto-crlf.sh took many seconds just to spew out
hundreds/thousands of lines in a shell loop emitting "skip" lines, so
I went with the to_skip=all implementation.

> Also, I thought that it was a no-no to say "to_skil=all" with
> skipped-reason in the middle of a test when the test is run under
> prove?

TAP has two main modes of operation. You can declare in advance that
you're going to run N tests, and then you must run exactly N; this
lets prove report progress on your tests as they run.

Or you can just run in a mode where you stream out however many tests
you're going to run as you go along, and then print "1..NUM_TESTS" at
the end.

We use the latter, so we can abort the entire test suite at any time
with test_done; that's what this change does.
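
As a rough sketch of that second mode (tap_ok and tap_done are
hypothetical helpers made up for illustration, not test-lib.sh
functions), the plan line is simply printed last, so it always matches
however many tests actually ran:

```shell
#!/bin/sh
# Sketch of TAP with a trailing plan; tap_ok and tap_done are
# hypothetical helpers, not test-lib.sh functions.
test_count=0

tap_ok () {
	test_count=$((test_count + 1))
	echo "ok $test_count - $1"
}

tap_done () {
	# The plan line comes last, so it reflects what actually ran.
	echo "1..$test_count"
}

tap_output=$(
	tap_ok "first check"
	tap_ok "second check"
	# Stopping early is fine: no plan was promised up front.
	tap_done
)
printf '%s\n' "$tap_output"
```

Because no count was promised up front, stopping after two tests
instead of twenty still yields a valid stream ending in "1..2".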

> Oh, by the way, is "date +%s" even portable?  I thought not.

The lib-git-p4.sh lib says not, and shells out to Python's time() as a
workaround. I could replace this with perl -e 'print time', but
thought it wasn't worth bothering with for an obscure optional feature
like this.
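
If it did turn out to matter, a fallback could look roughly like the
sketch below. now_seconds is a hypothetical helper name, and this
assumes perl is available, which is not something the patch itself
checks for:

```shell
#!/bin/sh
# Sketch: now_seconds is a hypothetical helper name.
now_seconds () {
	s=$(date +%s 2>/dev/null)
	case "$s" in
	''|*[!0-9]*)
		# date does not understand %s here; fall back to perl.
		perl -e 'print time'
		;;
	*)
		printf '%s' "$s"
		;;
	esac
}

start=$(now_seconds)
elapsed=$(( $(now_seconds) - start ))
echo "elapsed: ${elapsed}s"
```

The case statement treats anything that is not purely digits as "date
+%s is unsupported", which covers implementations that print the
format string back literally.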

Since 6a9d16a0a8 ("filter-branch: add passed/remaining seconds on
progress", 2015-09-07) git-filter-branch relies on `date +%s`.

I suspect Solaris users are just setting a GNU/updated toolpath in
their $PATH, and worrying about this isn't worth bothering with,
especially for this sort of thing.