Re: Expansion-lookalikes in heredoc delimiters

On 15/03/2018 18:11, Herbert Xu wrote:
On Thu, Mar 15, 2018 at 05:29:27PM +0100, Harald van Dijk wrote:

That's because POSIX specifies that after ${, everything up to the matching
}, not including nested strings, expansions, etc., is part of the word. No
exception is made when it spans multiple lines.

Another instance of this is

   if false; then
     echo ${
   fi
   }
   fi
   echo ok

This is accepted by bash, ksh, zsh and posh.

Interestingly, ksh bombs out on this:

	echo ${
	fi
	}

So this behaviour is not exactly consistent.

It's perfectly consistent. It gets accepted at parse time and only gets rejected at expansion time. That's how dash generally behaves as well:

  $ dash -c 'echo ${x^}'
  dash: 1: Bad substitution
  $ dash -c ': || echo ${x^}'
  $

Historically, as I understand it, ash would reject both of these, but you specifically modified dash to accept invalid expansions during parsing (which makes sense):


<https://git.kernel.org/pub/scm/utils/dash/dash.git/commit/?id=3df3edd13389ae768010bfacee5612346b413e38>

In any case, this substitution is invalid in all of these shells so
does it really matter?

Okay, it can be trivially modified to something that does work in other shells (even if it were actually executed), but gets rejected at parse time by dash:

  if false; then
    : ${$+
  }
  fi

It'd be nice if all shells used the same parse rules, so that scripts can detect dynamically which expansions are supported, but don't have to go through ugly eval commands to use them.
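For illustration, the detect-then-eval pattern referred to above might look roughly like the following. This is only a sketch: supports_expansion and capitalise are hypothetical helper names, not part of any shell, and ${x^} is used purely because it is the expansion from the dash example earlier in this message.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the running shell can expand "$1".
# The probe runs in a subshell so that a "bad substitution" error
# cannot abort the calling script.
supports_expansion() {
    ( eval ": $1" ) >/dev/null 2>&1
}

# Print "$1" with its first character upper-cased, using ${x^} only
# where the running shell supports it.
capitalise() {
    x=$1
    if supports_expansion '${x^}'; then
        # eval keeps the expansion away from parsers that would reject it
        eval 'printf "%s\n" "${x^}"'
    else
        # Portable fallback built from cut and tr
        first=$(printf '%s' "$x" | cut -c1 | tr '[:lower:]' '[:upper:]')
        rest=$(printf '%s' "$x" | cut -c2-)
        printf '%s%s\n' "$first" "$rest"
    fi
}

capitalise hello
```

If every shell deferred the error to expansion time, the eval inside capitalise could be dropped and the expansion written directly in the if branch.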

That's because bash doesn't ever match multi-line heredoc delimiters. Which
is what POSIX technically requires:

   "The here-document shall be treated as a single word that begins after the
next <newline> and continues until there is a line containing only the
delimiter and a <newline>, with no <blank> characters in between."

No single line will ever match a multi-line heredoc delimiter.
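To make the line-by-line reading of that sentence concrete, here is a minimal sketch of a matcher that follows it literally. It is not how any particular shell implements heredocs, and read_heredoc_body is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# Copy stdin to stdout until a line consisting only of the delimiter "$1"
# is seen. Returns 0 if the delimiter was found, 1 if EOF came first.
read_heredoc_body() {
    delim=$1
    while IFS= read -r line; do
        # Each comparison is against exactly one input line
        [ "$line" = "$delim" ] && return 0
        printf '%s\n' "$line"
    done
    return 1
}

# A single-line delimiter ends the body as expected.
printf 'a\nb\nEOF\nc\n' | read_heredoc_body EOF

# A delimiter containing a newline can never equal a single line,
# so the whole input is consumed and the function reports EOF.
multi='x
y'
printf 'x\ny\n' | read_heredoc_body "$multi" >/dev/null || echo "delimiter never matched"
```

Because read yields one line at a time, the comparison can only succeed for delimiters without an embedded newline, which is the behaviour described for bash above.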

Sure.  But this is a quality of implementation issue.  If you're
never going to match the delimiter you should probably emit an error
at the very start rather than at the EOF.

POSIX isn't clear on whether reaching EOF without seeing the heredoc delimiter is an error, and shells disagree. dash silently accepts it, as do ksh, yash and zsh.

When EOF is an acceptable way to terminate a heredoc, a delimiter which never matches can be useful to force the complete remainder of the file to be treated as a heredoc.
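A small sketch of that use, assuming the shell under test accepts EOF as a heredoc terminator (bash does too, but prints a warning on stderr). The helper name run_unterminated_heredoc and the delimiter NO_SUCH_LINE are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: write a script whose heredoc delimiter never
# appears, then run it with the shell named in "$1" (default: sh).
run_unterminated_heredoc() {
    script=$(mktemp) || return 1
    # Everything after the first line is heredoc body up to end of file.
    printf '%s\n' 'cat <<NO_SUCH_LINE' 'line one' 'line two' > "$script"
    # Discard stderr: some shells warn about the missing delimiter.
    "${1:-sh}" "$script" 2>/dev/null
    status=$?
    rm -f "$script"
    return "$status"
}

run_unterminated_heredoc sh
```

In a shell that treats EOF as an error instead, the helper would return a non-zero status, so the result should be checked rather than assumed.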

Another interesting thing to try is

	cat << ${ foo
	${

Also you can look at the quotes:

	cat << "
	"

IOW it's not clear that "word" in this context necessarily follows
the same rules used during normal tokenisation.

Quotes are clear: as far as they go, this *must* follow the same rules used
during normal tokenisation. Does any shell disagree?

I was talking about multi-line quotes, which you conveniently dismissed :)

I got back to that :) I picked another example first because I thought it would make it clearer that the normal tokenisation rules should be used.

For your example, does any shell, including dash, *not* treat this as a
multi-line heredoc delimiter consisting of a newline character (which may or
may not be matched by two blank lines, depending on the shell)?

Yes, almost every shell fails with the multi-line delimiter inside
double quotes.  Only dash and ksh93 seem to get it right.

posh accepts multi-line heredoc delimiters as well.

Cheers,
Harald van Dijk