On Thu, Mar 30, 2023 at 3:30 PM Jeff King <peff@xxxxxxxx> wrote: > From: Eric Sunshine <sunshine@xxxxxxxxxxxxxx> > > An unclosed here-doc in a test is a problem, because it silently gobbles > up any remaining commands. Since 99a64e4b73c (tests: lint for run-away > here-doc, 2017-03-22) we detect this by piggy-backing on the internal > chainlint checker in test-lib.sh. > > However, it would be nice to detect it in chainlint.pl, for a few > reasons: > > - the output from chainlint.pl is much nicer; it can show the exact > spot of the error, rather than a vague "somewhere in this test you > broke the &&-chain or had a bad here-doc" message. > > - the implementation in test-lib.sh runs for each test snippet. And > since it requires a subshell, the extra cost is small but not zero. > If chainlint.pl can reliably find the problem, we can optimize the > test-lib.sh code. > > The chainlint.pl code never intended to find here-doc problems. But > since it has to parse them anyway (to avoid reporting problems inside > here-docs), most of what we need is already there. We can detect the > problem when we fail to find the missing end-tag in swallow_heredocs(). > The extra change in scan_heredoc_tag() stores the location of the start > of the here-doc, which lets us mark it as the source of the error in the > output (see the new tests for examples). > > [jk: added commit message and tests] Thanks for packaging this up as a proper patch. > Signed-off-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx> > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > New in this iteration (thanks again, Eric!). > > I changed the error-tag from "HERE" to "UNCLOSED-HEREDOC". I don't think > there any syntactic limitations to worry about (it's just shown to the > user). The error-tag is plucked by the script at a couple places using a regex such as /\?![^?]+\?!/ which won't be tripped up by the longer and hyphenated name, so no problem there. > And it needs to be descriptive enough not to confuse users. > "HERE" is especially bad because my brain interprets it as "HERE there > is an error", which makes me say "Right. What error?". > > So "HEREDOC" would work, but there is no reason (aside from length) not > to err on the side of being descriptive. I had the same thought about HERE being potentially misleading, thus wavered between HERE and HEREDOC, but ultimately chose the shorter for length-similarity with the existing AMP and LOOP. That reminds me that, for some time now, I've been thinking that t/README ought to have a section explaining the meaning of AMP and LOOP (and HERE, if we went with that) since newcomers may not intuitively understand what such terse complaints mean. > diff --git a/t/chainlint.pl b/t/chainlint.pl > @@ -80,7 +80,8 @@ sub scan_heredoc_tag { > - push(@{$self->{heretags}}, $indented ? "\t$tag" : "$tag"); > + $$token[0] = $indented ? "\t$tag" : "$tag"; > + push(@{$self->{heretags}}, $token); > return "<<$indented$tag"; > @@ -169,10 +170,18 @@ sub swallow_heredocs { > - my $indent = $tag =~ s/^\t// ? '\\s*' : ''; > - $$b =~ /(?:\G|\n)$indent\Q$tag\E(?:\n|\z)/gc; > + my $indent = $$tag[0] =~ s/^\t// ? '\\s*' : ''; > + $$b =~ /(?:\G|\n)$indent\Q$$tag[0]\E(?:\n|\z)/gc; > + if (pos($$b) > $start) { > + my $body = substr($$b, $start, pos($$b) - $start); > + $self->{lineno} += () = $body =~ /\n/sg; > + next; > + } > + push(@{$self->{parser}->{problems}}, ['UNCLOSED-HEREDOC', $tag]); > + $$b =~ /(?:\G|\n).*\z/gc; # consume rest of input > my $body = substr($$b, $start, pos($$b) - $start); > $self->{lineno} += () = $body =~ /\n/sg; > + last; Oddly enough, these changes look fine to me.