On Tue, Jul 02, 2024 at 05:25:48PM -0400, Eric Sunshine wrote: > I created a white-room fix for this issue, as well, before taking a > look at your patch. The two implementations bear a strong similarity > which suggests that we agree upon the basic approach. > > My implementation, however, takes a more formal and paranoid stance. > Rather than squirreling away only the most-recently-seen heredoc body, > it stores each heredoc body along with the tag which introduced it. > This makes it robust against cases when multiple heredocs are > initiated on the same line (even within different parse contexts): > > cat <<EOFA && x=$(cat <<EOFB && > A body > EOFA > B body > EOFB > > Of course, that's not likely to come up in the context of > test_expect_* calls, but I prefer the added robustness over the more > lax approach. Yes, that's so much better than what I wrote. I didn't engage my brain very much when I read the in-code comments about multiple tags on the same line, and I thought you meant: cat <<FOO <<BAR this is foo FOO this is bar BAR which is...weird. It does "work" in the sense that "FOO" is a here-doc that should be skipped past. But it is not doing anything useful; cat sees only "this is bar" on stdin. So even for this case, the appending behavior that my patch does would not make sense. And of course for the actual useful thing, which you wrote above, appending is just nonsense. Recording and accessing by tag is the right thing. > > And then ScriptParser::parse_cmd() just has to recognize that any "<<" > > token isn't interesting, and that "-" means "read the here-doc". > > In my implementation, the `<<` token is "interesting" because the > heredoc tag is attached to it, and the tag is needed to pluck the > heredoc body from the set of saved bodies (since my implementation > doesn't assume most-recently-seen body is the correct one). Ah, OK. So it would probably not be that big of a deal to record a single bit for "this heredoc is interpolated". But until we have anything useful to do with that information, let's not worry about it for now. > > diff --git a/t/chainlint.pl b/t/chainlint.pl > > @@ -168,12 +168,15 @@ sub swallow_heredocs { > > if (pos($$b) > $start) { > > my $body = substr($$b, $start, pos($$b) - $start); > > + $self->{parser}->{heredoc} .= > > + substr($body, 0, length($body) - length($&)); > > $self->{lineno} += () = $body =~ /\n/sg; > > In my implementation, I use regex to strip off the ending tag before > storing the heredoc body. When I later looked at your implementation, > I noticed that you used substr() -- which seems preferable -- but > discovered that it strips too much in some cases. For instance, in > t0600, I saw that: Yeah, I was afraid of trying another regex, just because there are optional bits (like indentation) that we'd have to account for. Since $& contains the match already, that's all taken care of by the existing regex. From your follow-up, it sounds like the substr() approach does work (*phew*). -Peff