Re: [PATCH] Make test output coloring more intuitive

Adam Spiers <git@xxxxxxxxxxxxxx> · Tue, 18 Sep 2012 22:36:17 +0100

On Mon, Sep 17, 2012 at 01:50:39PM -0700, Junio C Hamano wrote:
> Adam Spiers <git@xxxxxxxxxxxxxx> writes:
> 
> > 1. Change the color of individual known breakages from bold green to
> >    bold yellow.  This seems more appropriate when considering the
> >    universal traffic lights coloring scheme, where green conveys the
> >    impression that everything's OK, and amber that something's not
> >    quite right.
> >
> > 2. Likewise, change the color of the summarized total number of known
> >    breakages from bold red to bold yellow to be less alarmist and more
> >    consistent with the above.
> >
> > 3. Change color of unexpectedly fixed known breakages to bold red.  An
> >    unexpectedly passing test indicates that the test is wrong or the
> >    semantics of the code being tested have changed.  Either way this
> >    is an error which is arguably as bad as a failing test, and as such
> >    is now counted in the totals too.
> 
> I agree with Peff's comments.
> 
> The point #3 above wants to be a separate patch;

OK, re-rolling this.

> we may even want to consider a follow-up change to add an option to
> make a "test that is expected to fail did not fail" case a failure.

Sounds like a nice idea.

> >  test_known_broken_ok_ () {
> >  	test_fixed=$(($test_fixed+1))
> > -	say_color "" "ok $test_count - $@ # TODO known breakage"
> > +	test_broken=$(($test_broken+1))
> > +	say_color error "ok $test_count - $@ # TODO known breakage vanished"
> >  }
> 
> Also I wonder if this is still a "TODO".

Hah, I should trust my first instincts more; my first version of the
patch dropped the "TODO", but then I put it back in because I thought
people would object :-)

> "# TODO fixed known breakage", meaning that it is something that
> must be looked at by whoever happened to have fixed the known
> breakage by accident, might be a better wording.

I would challenge the assumption that the test passed because someone
deliberately fixed the code it was testing.  Whilst this is a probable
scenario (e.g. if they forgot to adjust the test to expect success
rather than failure), it's not the only one.  Tests can be brittle and
depend on external factors which can change unexpectedly.  For
example, I noticed that the final 'general options plus command' test
in t9902-completion.sh is currently dependent on the contents of the
current $PATH.  This is because I have a shell-script called
`git-check-email' on mine, so `check-email' becomes an extra
subcommand completion possibility for the prefix `check'.  Almost
every other developer running this test would not encounter the same
failure.  It's not a huge stretch to imagine a similar corner case
which could unexpectedly cause a test to pass when failure was
expected.

Therefore I would suggest a wording which avoids implying that
something was deliberately fixed.  That was my reasoning behind
introducing the word "vanished" to the output.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html