On Wed, Aug 8, 2018 at 6:50 PM Jeff King <peff@xxxxxxxx> wrote:
> On Tue, Aug 07, 2018 at 04:21:31AM -0400, Eric Sunshine wrote:
> > +# Swallowing here-docs with arbitrary tags requires a bit of finesse. When a
> > +# line such as "cat <<EOF >out" is seen, the here-doc tag is moved to the front
> > +# of the line enclosed in angle brackets as a sentinel, giving "<EOF>cat >out".
>
> Gross, but OK, as long as we would not get confused by a line that
> actually started with <EOF> at the start.

It can't get confused by such a line. The here-doc swallower prepends
the sentinel when it starts the swallowing process and removes it at
the end. Even if a line actually started with "<EOF>", it would become
"<EOF><EOF>cmd" while swallowing the here-doc, and be restored to
"<EOF>cmd" at the end. Stripping the "<EOF>" is done non-greedily, so
it wouldn't remove both of them. Likewise, non-greedy matching is used
for pulling the "EOF" out of the "<...>" when trying to match against
the terminating "EOF" line, so there can be no confusion.

> > +/<<[ ]*[-\\]*[A-Z0-9_][A-Z0-9_]*/ {
> > + s/^\(.*\)<<[ ]*[-\\]*\([A-Z0-9_][A-Z0-9_]*\)/<\2>\1<</
> > + s/[ ]*<<//
>
> Here-docs can use lowercase, too, though I'd personally frown on that
> from a style perspective.

Yeah, I was going with the tighter uppercase-only pattern which
Jonathan suggested[1], but I guess it wouldn't hurt to re-roll to
allow lowercase too.

[1]: https://public-inbox.org/git/20180730205914.GE156463@xxxxxxxxxxxxxxxxxxxxxxxxx/

> It looks like this doesn't catch:
>
>   cat <<'EOF'
>   EOF
>
> either. I think we prefer the backslash style, but there are quite a few
> <<-'EOF' hits. Is it covered somewhere else?

No. I've gotten so used to \EOF in this codebase that it didn't occur
to me to even think about 'EOF', but a re-roll could add that as
well.
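
For anyone who wants to poke at the sentinel round-trip described above,
here is a minimal sketch. The helper names add_sentinel and
strip_sentinel are hypothetical, and the stripping expression is an
assumption about how the non-greedy removal could be written in sed
(a negated bracket expression that stops at the first ">"), not
necessarily what the patch does. The two substitutions in add_sentinel
are the ones quoted from the patch.

```shell
#!/bin/sh
# Sketch only: add_sentinel and strip_sentinel are hypothetical helper names.

# Move the here-doc tag to the front of the line as "<TAG>" and drop the
# "<<TAG" operator, using the two substitutions quoted from the patch.
add_sentinel () {
    sed -e 's/^\(.*\)<<[ ]*[-\\]*\([A-Z0-9_][A-Z0-9_]*\)/<\2>\1<</' \
        -e 's/[ ]*<<//'
}

# Remove only the first "<TAG>". [^>]* cannot run past the first ">",
# which is an assumed way of getting the non-greedy behavior in sed.
strip_sentinel () {
    sed -e 's/^<[^>]*>//'
}

# An ordinary here-doc line gains the sentinel.
printf '%s\n' 'cat <<EOF >out' | add_sentinel

# A line that already starts with "<EOF>" gets a second one while
# swallowing, and stripping restores the original prefix.
printf '%s\n' '<EOF>cmd <<EOF' | add_sentinel
printf '%s\n' '<EOF>cmd <<EOF' | add_sentinel | strip_sentinel
```

Run as-is, this should print "<EOF>cat >out", then "<EOF><EOF>cmd",
then "<EOF>cmd", which matches the behavior described in the reply:
only the sentinel added by the swallower is removed.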