On 1/29/21 10:04 AM, Michal Privoznik wrote:
On 1/29/21 1:30 PM, Daniel Henrique Barboza wrote:
On 1/27/21 2:59 PM, Michal Privoznik wrote:
Since we've switched to meson, our tests run with a timeout (meson
uses 30 seconds as the default). However, not every machine that
builds libvirt is fast enough to run every test under 30 seconds
(each test binary has its own timeout, but still). For instance
when building a package for a distro on a farm that's under load,
or on generally slow ARM hardware. While each developer can
tune their build command line by adding --timeout-multiplier=10,
this is hard to do for the aforementioned build farms.
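For reference, this is the per-developer workaround; a minimal sketch,
assuming a meson build tree in ./build:

  # Run the whole suite with every per-test timeout multiplied by 10;
  # useful on slow or heavily loaded machines.
  meson test -C build --timeout-multiplier=10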
It's time to admit that not everybody has the latest, top-shelf
CPU and increase the timeout.
Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
---
This sure will help these build farm environments, but what about the cases
where an actual timeout means that there is something wrong with the code?
E.g. commits 46d88d8dba56 and 2ba0b7497ce7 were only possible because I was
seeing tests time out on Power hosts while the 30 sec timeout was being
enforced.
A 120 second default timeout for the majority of the test cases is a long time.
virschematest on this laptop takes 2.5 sec to complete. If I do something
wrong in the code and the test becomes 4 times slower (10 sec), I will not be
able to detect it (I'll need to start keeping track or something).
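For what it's worth, an individual test can be timed directly; a sketch,
assuming the test's meson name matches the binary name:

  # meson prints the wall-clock duration on each result line, so a
  # sudden slowdown in a single test is easy to spot by eye.
  meson test -C build virschematest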
With a 30 second timeout you won't detect that either. Using the timeout as an indicator of test failure is wrong IMO. And if I were not lazy and fixed the 'check-access' test suite, we would see instantly which tests access paths on the host (=> depend on host configuration).
30 sec is too long for most tests :)
Let's put it this way: I don't think most of us keep track of how long a
certain test takes to complete during development, and the 30 sec timeout is
a marker to see if we messed up or not. A 120 sec timeout is too extreme for
my dev env (and I believe most if not all of us can run the test suite without
any problems during development).
All that said, I just ran 'ninja -C build test' and verified that we don't provide
a total run time for all tests. We provide the time taken by each of the 306 tests,
but not a total. If we change the script to output the total time taken to run all
tests in the test suite in a successful run (i.e. no failed tests), then I wouldn't
mind the timeout increase. I could run the test suite on master, get the total time
taken, do some coding, run again, and compare the new total. This comparison would
give me a hint of whether something went too wrong, and then I could compare the
per-test numbers myself to see what happened. In that case I wouldn't mind removing
all test timeouts or increasing the timeout to help the distros or what have you.
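In the meantime a rough total can be pulled out by hand; a sketch, assuming
meson's usual testlog.json layout (one JSON object per test per line, each
carrying a "duration" field):

  # Time the whole suite run...
  time ninja -C build test
  # ...or sum the per-test durations meson recorded for the last run:
  jq -s 'map(.duration) | add' build/meson-logs/testlog.json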
Thanks,
DHB
You'll have to
run the test suite on your RasPi 2B to see that something went wrong, because the
timeout is better tuned to your RasPi than to this laptop, but by then the code is
already upstream.
So should we make timeouts shorter then? Why is 30 seconds the sweet spot?
And the tests will get more complex and will naturally take longer to complete.
Eventually this timeout might not be enough. Increase the timeout again?
Sure, why not? We adapt to newer gcc/clang/$whatever, why not to the timeout?
Michal