Re: [PATCH] test-lib: make '--stress' more bisect-friendly

SZEDER Gábor <szeder.dev@xxxxxxxxx> · Fri, 8 Feb 2019 19:23:19 +0100

On Fri, Feb 08, 2019 at 11:47:33AM -0500, Jeff King wrote:
> On Fri, Feb 08, 2019 at 12:50:45PM +0100, SZEDER Gábor wrote:
> 
> >   - Make it exit with failure if a failure is found.
> > 
> >   - Add the '--stress-limit=<N>' option to repeat the test script
> >     at most N times in each of the parallel jobs, and exit with
> >     success when the limit is reached.
> > [...]
> > 
> > This is a case when an external stress script works better, as it can
> > easily check commits in the past...  if someone has such a script,
> > that is.
> 
> Heh, I literally just implemented this kind of max-count in my own
> "stress" script[1] to handle this recent t0025 testing. So certainly I
> think it is a good idea.
> 
> Picking an <N> is tough. Too low and you get a false negative, too high
> and you can wait forever, especially if the script is long. But I don't
> think there's any real way to auto-scale it, except by seeing a few of
> the failing cases and watching how long they take.

So far I've chosen <N> like this: run the test script with --stress
3-5 times to trigger the failure, take the highest repetition count
that was necessary for the failure, multiply it by 4-6 to get a round
number, and that's a good ballpark for <N>.  And once bisect came up
with the suspect commit, I double checked it by letting the test
script run with --stress on its parent commit for at least 5-10x <N>
repetitions.

Anyway, I doubt that auto-scaling <N> is worth the effort.

> >  t/README      |  5 +++++
> >  t/test-lib.sh | 18 ++++++++++++++++--
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> Patch looks good. A few observations:
> 
> > @@ -237,8 +248,10 @@ then
> >  				exit 1
> >  			' TERM INT
> >  
> > -			cnt=0
> > -			while ! test -e "$stressfail"
> > +			cnt=1
> > +			while ! test -e "$stressfail" &&
> > +			      { test -z "$stress_limit" ||
> > +				test $cnt -le $stress_limit ; }
> >  			do
> >  				$TEST_SHELL_PATH "$0" "$@" >"$TEST_RESULTS_BASE.stress-$job_nr.out" 2>&1 &
> >  				test_pid=$!
> 
> You switch to 1-indexing the counts here. I think that makes sense,
> since otherwise --stress-limit=300 would end at "1.299", etc.

Yeah, that's exactly why I did it.

> 
> > @@ -261,6 +274,7 @@ then
> >  
> >  	if test -f "$stressfail"
> >  	then
> > +		stress_exit=1
> >  		echo "Log(s) of failed test run(s):"
> >  		for failed_job_nr in $(sort -n "$stressfail")
> >  		do
> 
> I think I'd argue that this missing stress_exit is a bug in the original
> script,

Well, yes, indeed.

Though being able to trigger an elusive test failure is a success in
my book ;)

> and somewhat orthogonal to the limit counter. But I don't think
> it's worth the trouble to split it out (and certainly the theme of "now
> you can run this via bisect" unifies the two changes).
> 
> -Peff