Re: [PATCH 1/2] test-lib: allow test snippets as here-docs

Jeff King <peff@xxxxxxxx> · Sat, 6 Jul 2024 01:31:05 -0400

On Tue, Jul 02, 2024 at 05:25:48PM -0400, Eric Sunshine wrote:

> I created a white-room fix for this issue, as well, before taking a
> look at your patch. The two implementations bear a strong similarity
> which suggests that we agree upon the basic approach.
> 
> My implementation, however, takes a more formal and paranoid stance.
> Rather than squirreling away only the most-recently-seen heredoc body,
> it stores each heredoc body along with the tag which introduced it.
> This makes it robust against cases when multiple heredocs are
> initiated on the same line (even within different parse contexts):
> 
>     cat <<EOFA && x=$(cat <<EOFB &&
>     A body
>     EOFA
>     B body
>     EOFB
> 
> Of course, that's not likely to come up in the context of
> test_expect_* calls, but I prefer the added robustness over the more
> lax approach.

Yes, that's so much better than what I wrote. I didn't engage my brain
very much when I read the in-code comments about multiple tags on the
same line, and I thought you meant:

  cat <<FOO <<BAR
  this is foo
  FOO
  this is bar
  BAR

which is...weird. It does "work" in the sense that "FOO" is a here-doc
that should be skipped past. But it is not doing anything useful; cat
sees only "this is bar" on stdin. So even for this case, the appending
behavior that my patch does would not make sense.

And of course for the actual useful thing, which you wrote above,
appending is just nonsense. Recording and accessing by tag is the right
thing.

> > And then ScriptParser::parse_cmd() just has to recognize that any "<<"
> > token isn't interesting, and that "-" means "read the here-doc".
> 
> In my implementation, the `<<` token is "interesting" because the
> heredoc tag is attached to it, and the tag is needed to pluck the
> heredoc body from the set of saved bodies (since my implementation
> doesn't assume most-recently-seen body is the correct one).

Ah, OK. So it would probably not be that big of a deal to record a
single bit for "this heredoc is interpolated". But until we have
anything useful to do with that information, let's not worry about it
for now.

> > diff --git a/t/chainlint.pl b/t/chainlint.pl
> > @@ -168,12 +168,15 @@ sub swallow_heredocs {
> >                 if (pos($$b) > $start) {
> >                         my $body = substr($$b, $start, pos($$b) - $start);
> > +                       $self->{parser}->{heredoc} .=
> > +                               substr($body, 0, length($body) - length($&));
> >                         $self->{lineno} += () = $body =~ /\n/sg;
> 
> In my implementation, I use regex to strip off the ending tag before
> storing the heredoc body. When I later looked at your implementation,
> I noticed that you used substr() -- which seems preferable -- but
> discovered that it strips too much in some cases. For instance, in
> t0600, I saw that:

Yeah, I was afraid of trying another regex, just because there are
optional bits (like indentation) that we'd have to account for. Since $&
contains the match already, that's all taken care of by the existing
regex. From your follow-up, it sounds like the substr() approach does
work (*phew*).

-Peff