Re: [PATCH v4] tests: move t0009-prio-queue.sh to the new unit testing framework

Jeff King <peff@xxxxxxxx> · Mon, 22 Jan 2024 17:42:23 -0500

On Mon, Jan 22, 2024 at 11:09:39AM -0800, Junio C Hamano wrote:

> > +		case DUMP:
> > +			while ((peek = prio_queue_peek(&pq))) {
> > +				get = prio_queue_get(&pq);
> > +				if (!check(peek == get))
> > +					return;
> > +				if(!check_int(result[j++], ==, show(get)))
> > +					test_msg("failed at result[] index %d", j-1);
> > +			}
> > +			break;
> 
> OK.  So this one checks as we go.  I am not sure how easy to grok a
> breakage diagnosis with these giving the same message, without
> giving any context of the failure (e.g. when we are fed
> 
> 	INPUT  = 6 2 4 GET 5 3 GET GET 1 DUMP
> 	EXPECT = 2 3 4 1 5 6
> 
> and for some reason if the first GET did not yield expected 2 but
> gave us, say, 6, we only see "left: 2, right: 6" followed by "failed
> at result[] index 0", and nothing else.  
> 
> If it said something like "We pushed 6, 2, 4 and then did GET" to
> give the reader a bit more context, it would make it easier to see
> why we were complaining, i.e. expecting to see 2, instead getting 6.
> But perhaps that is too much to ask to this code?
> 
> I dunno.  Those who wanted to see an easier-to-diagnose output may
> have better ideas.

I don't have any bright ideas, but here are some dumb ones.

The big issue is that storing and manipulating data in C is tricky and
requires code to do correctly. And the whole idea of these tests is to
avoid having a ton of special code. That's why I actually think that the
existing strategy used in t0009 and elsewhere is pretty good. It pawns
off output handling and such to existing programs.

But if we really want to keep it all in C, some options are:

  1. We could log what is happening verbosely to stderr (or maybe stdout
     with a TAP "#" prefix?). That lets it go nowhere most of the time,
     but you can see it if you are investigating a failure (like the way
     our current "-v" works).

     Obviously by that same argument you can fire up a debugger to look
     at the problem more closely when you have a failure. But failures
     aren't always easy to reproduce. Sometimes they're racy, sometimes
     CI fails but local runs don't (I've found CI's use of "-v -x"
     especially helpful there), etc.

  2. We can rely on a dynamic-growth array of ints to store the result,
     and then a check_int_array() to compare/output them. That's custom
     code for the test harness, but in theory it's reusable across many
     tests.

  3. In the same direction, we could simply output to a strbuf or
     similar, and then compare the strbufs. We could even show a diff
     when they are not the same. ;) That is reimplementing what scripts
     like t0009 are doing already, but maybe people think there is other
     value in shoving it all into a single C program (as I've said, I
     remain a bit skeptical).

-Peff