Re: [PATCH 1/2] test-lib: allow test snippets as here-docs

Eric Sunshine <sunshine@xxxxxxxxxxxxxx> · Tue, 2 Jul 2024 17:25:48 -0400

On Mon, Jul 1, 2024 at 8:51 PM Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Jul 01, 2024 at 06:45:19PM -0400, Eric Sunshine wrote:
> > We lose `chainlint` functionality for test bodies specified in this manner.
>
> Hmm. The patch below seems to work on a simple test.
>
> The lexer stuffs the heredoc into a special variable. Which at first
> glance feels like a hack versus returning it from the token stream, but
> the contents really _aren't_ part of that stream. They're a separate
> magic thing that is found on the stdin of whatever command the tokens
> represent.

I created a white-room fix for this issue, as well, before taking a
look at your patch. The two implementations bear a strong similarity
which suggests that we agree upon the basic approach.

My implementation, however, takes a more formal and paranoid stance.
Rather than squirreling away only the most-recently-seen heredoc body,
it stores each heredoc body along with the tag which introduced it.
This makes it robust against cases when multiple heredocs are
initiated on the same line (even within different parse contexts):

    cat <<EOFA && x=$(cat <<EOFB &&
    A body
    EOFA
    B body
    EOFB

Of course, that's not likely to come up in the context of
test_expect_* calls, but I prefer the added robustness over the more
lax approach.

> And then ScriptParser::parse_cmd() just has to recognize that any "<<"
> token isn't interesting, and that "-" means "read the here-doc".

In my implementation, the `<<` token is "interesting" because the
heredoc tag is attached to it, and the tag is needed to pluck the
heredoc body from the set of saved bodies (since my implementation
doesn't assume most-recently-seen body is the correct one).

> Obviously we'd want to add to the chainlint tests here. It looks like
> the current test infrastructure is focused on evaluating snippets, with
> the test_expect_success part already handled.

Yes, the "snippet" approach is a throwback to the old chainlint.sed
implementation when there wasn't any actual parsing going on. As you
note, this unfortunately does not allow for testing parsing-related
aspects of the implementation, which is a limitation I most definitely
felt when chainlint.pl was implemented. It probably would be a good
idea to update the infrastructure to allow for more broad testing but
that doesn't need to be part of the changes being discussed here.

> diff --git a/t/chainlint.pl b/t/chainlint.pl
> @@ -168,12 +168,15 @@ sub swallow_heredocs {
>                 if (pos($$b) > $start) {
>                         my $body = substr($$b, $start, pos($$b) - $start);
> +                       $self->{parser}->{heredoc} .=
> +                               substr($body, 0, length($body) - length($&));
>                         $self->{lineno} += () = $body =~ /\n/sg;

In my implementation, I use regex to strip off the ending tag before
storing the heredoc body. When I later looked at your implementation,
I noticed that you used substr() -- which seems preferable -- but
discovered that it strips too much in some cases. For instance, in
t0600, I saw that:

    cat >expected <<-\EOF &&
    HEAD
    PSEUDO_WT_HEAD
    refs/bisect/wt-random
    refs/heads/main
    refs/heads/wt-main
    EOF

was getting stripped down to:

    HEAD
    PSEUDO_WT_HEAD
    refs/bisect/wt-random
    refs/heads/main
    refs/heads/wt-ma{{missing-nl}}

It wasn't immediately obvious why this was happening, though I didn't
spend a lot of time trying to debug it.

Although I think my implementation is complete, I haven't submitted it
yet because I discovered that the changes you made to t1404 are
triggering false-positives:

    # chainlint: t1404-update-ref-errors.sh
    # chainlint: existing loose ref is a simple prefix of new
    120 prefix=refs/1l &&
    121 test_update_rejected a c e false b c/x d \
    122   '$prefix/c' exists; ?!AMP?! cannot create '$prefix/c/x'

Unfortunately, I ran out of time, thus haven't tracked down this
problem yet. I also haven't tested your implementation yet to
determine if this is due to a change I made or due to a deeper
existing issue with chainlint.pl.