Re: [PATCH 0/4] chainlint: improve annotated output

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 08 Nov 2022 23:17:14 +0100

On Tue, Nov 08 2022, Eric Sunshine via GitGitGadget wrote:

> When chainlint detects problems in a test, such as a broken &&-chain, it
> prints out the test with "?!FOO?!" annotations inserted at each problem
> location. However, rather than annotating the original test definition, it
> instead dumps out a parsed token representation of the test. Since it lacks
> comments, indentation, here-doc bodies, and so forth, this tokenized
> representation can be difficult for the test author to digest and relate
> back to the original test definition.
>
> An earlier patch series[1] improved the output somewhat by colorizing the
> "?!FOO?!" annotations and the "# chainlint:" lines, but the output can still
> be difficult to digest.
>
> This patch series further improves the output by instead making chainlint.pl
> annotate the original test definition rather than the parsed token stream,
> thus preserving indentation (and whitespace, in general), here-doc bodies,
> etc., which should make it easier for a test author to relate each problem
> back to the source.
>
> This series was inspired by usability comments from Peff[2] and Ævar[3] and
> a bit of discussion which followed[4][5].
>
> (Note to self: Add Ævar to nerd-snipe blacklist alongside Peff.)

Heh! It's great to see a follow-up to our discussion the other day, and
having the output verbatim & annotated looks much better, especially for
complex tests.

E.g. (taking one at random, after some grepping/skimming), ruining this one:

	diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
	index dcaab7265f5..c27539a773d 100755
	--- a/t/t6300-for-each-ref.sh
	+++ b/t/t6300-for-each-ref.sh
	@@ -1365,8 +1365,7 @@ test_expect_success 'for-each-ref --ignore-case works on multiple sort keys' '
	                do
	                        GIT_COMMITTER_EMAIL="$email@xxxxxxxxxxx" \
	                        git tag -m "tag $subject" icase-$(printf %02d $nr) &&
	-                       nr=$((nr+1))||
	-                       return 1
	+                       nr=$((nr+1))
	                done
	        done &&
	        git for-each-ref --ignore-case \

Would, before emit (correct, but a bit of a token-soup):

	$ ./t6300-for-each-ref.sh 
	# chainlint: ./t6300-for-each-ref.sh
	# chainlint: for-each-ref --ignore-case works on multiple sort keys

	nr=0 &&
	for email in a A b B
	do
	for subject in a A b B
	do
	GIT_COMMITTER_EMAIL="$email@xxxxxxxxxxx" git tag -m "tag $subject" icase-$(printf %02d $nr) &&
	nr=$((nr+1)) ?!LOOP?!
	done ?!LOOP?!
	done &&
	git for-each-ref --ignore-case --format="%(taggeremail) %(subject) %(refname)" --sort=refname --sort=subject --sort=taggeremail refs/tags/icase-* > actual &&
	cat > expect <<-EOF &&
	test_cmp expect actual
	error: bug in the test script: lint error (see '?!...!? annotations above)

But now it'll instead emit:

	$ ./t6300-for-each-ref.sh
	# chainlint: ./t6300-for-each-ref.sh
	# chainlint: for-each-ref --ignore-case works on multiple sort keys
	        # name refs numerically to avoid case-insensitive filesystem conflicts
	        nr=0 &&
	        for email in a A b B
	        do
	                for subject in a A b B
	                do
	                        GIT_COMMITTER_EMAIL="$email@xxxxxxxxxxx" \
	                        git tag -m "tag $subject" icase-$(printf %02d $nr) &&
	                        nr=$((nr+1)) ?!LOOP?!
	                done ?!LOOP?!
	        done &&
	        git for-each-ref --ignore-case \
	                --format="%(taggeremail) %(subject) %(refname)" \
	                --sort=refname \
	                --sort=subject \
	                --sort=taggeremail \
	                refs/tags/icase-* >actual &&
	        cat >expect <<-\EOF &&
	        <a@xxxxxxxxxxx> tag a refs/tags/icase-00
	        <a@xxxxxxxxxxx> tag A refs/tags/icase-01
	        <A@xxxxxxxxxxx> tag a refs/tags/icase-04
	        <A@xxxxxxxxxxx> tag A refs/tags/icase-05
	        <a@xxxxxxxxxxx> tag b refs/tags/icase-02
	        <a@xxxxxxxxxxx> tag B refs/tags/icase-03
	        <A@xxxxxxxxxxx> tag b refs/tags/icase-06
	        <A@xxxxxxxxxxx> tag B refs/tags/icase-07
	        <b@xxxxxxxxxxx> tag a refs/tags/icase-08
	        <b@xxxxxxxxxxx> tag A refs/tags/icase-09
	        <B@xxxxxxxxxxx> tag a refs/tags/icase-12
	        <B@xxxxxxxxxxx> tag A refs/tags/icase-13
	        <b@xxxxxxxxxxx> tag b refs/tags/icase-10
	        <b@xxxxxxxxxxx> tag B refs/tags/icase-11
	        <B@xxxxxxxxxxx> tag b refs/tags/icase-14
	        <B@xxxxxxxxxxx> tag B refs/tags/icase-15
	        EOF
	        test_cmp expect actual
	error: bug in the test script: lint error (see '?!...!? annotations above)

Which is so much better, i.e. as you're preserving the whitespace &
comments, and the "?!LOOP?!" is of course much easier to see with the
colored output.

I hadn't noticed before that the contents of here-docs was pruned, but
that made sense in the previous parser, but having the content.

Also, and I guess this is an attempt to evade your blacklist. I *did*
notice when playing around with this that if I now expand the "1 while"
loop here:

	my $s = do { local $/; <$fh> };
	close($fh);
	my $parser = ScriptParser->new(\$s);
	1 while $parser->parse_cmd();

To something that "follows along" with the parser it shouldn't be too
hard in the future to add line number annotations now. E.g. for
"#!/bin/sh\n" you'll emit a token like "\n", but the positions will be
0, 10.

But that's all for some hypothetical future, this is already much better
:)