Re: Expansion-lookalikes in heredoc delimiters

On 15/03/2018 15:52, Herbert Xu wrote:
> On Thu, Mar 15, 2018 at 12:41:10PM +0100, Harald van Dijk wrote:
>>
>> It is if you want to do it the way POSIX specifies. You're adding a special
>> exception in the parser. I don't see how this approach can be extended to
>> handle the other examples in my mail:
>
> I don't think it's exactly clear what POSIX says on this.

I don't think it's clear what POSIX means to say, but I do think it's clear what it does say.

>>    cat <<`one two`
>>    ok
>>    `one two`
>
> Try this one:
>
> 	cat << ${
> 	${
>
> bash/ksh93 both will get stuck forever until a right brace is
> entered.

That's because POSIX specifies that after ${, everything up to the matching }, not including nested strings, expansions, etc., is part of the word. No exception is made when it spans multiple lines.

Another instance of this is

  if false; then
    echo ${
  fi
  }
  fi
  echo ok

This is accepted by bash, ksh, zsh and posh.

However, for this instance, IIRC (though I'm having trouble finding where), POSIX says or means to say that it's okay to flag this as an invalid parameter expansion at parse time even though the evaluation would otherwise be skipped.

> While other shells all behave the way dash does.

All? At least two others don't: yash rejects ${ as invalid because of the missing parameter name. zsh agrees with bash and ksh that it should look for the closing }.

> Considering the fact that even if you closed the brace after
> the newline bash would still be stuck forever,

That's because bash doesn't ever match multi-line heredoc delimiters, which is what POSIX technically requires:

"The here-document shall be treated as a single word that begins after the next <newline> and continues until there is a line containing only the delimiter and a <newline>, with no <blank> characters in between."

No single line will ever match a multi-line heredoc delimiter.
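To make that concrete, here is a minimal sketch of line-by-line delimiter matching as the quoted text describes it. It is only an illustration, not how any particular shell implements it; matches_any_line is a made-up helper name, and the sample input is invented.

```shell
# Hypothetical helper (not taken from any shell's source): succeed if
# some single line of stdin is exactly equal to the delimiter in $1.
matches_any_line() {
    delim=$1
    while IFS= read -r line; do
        if [ "$line" = "$delim" ]; then
            return 0
        fi
    done
    return 1
}

# A variable holding one newline character.
nl='
'

# A one-line delimiter can be found among the input lines.
if printf 'one\ntwo\n' | matches_any_line "two"; then
    echo "single-line delimiter: matched"
fi

# A delimiter containing a newline can never equal a single line,
# because read splits the input at every newline.
if ! printf 'one\ntwo\n' | matches_any_line "one${nl}two"; then
    echo "multi-line delimiter: no line matched"
fi
```

A shell that wants multi-line delimiters to work therefore has to compare against more than one input line at a time, which goes beyond what the quoted sentence literally asks for.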

> I think this
> behaviour is suboptimal.  ksh93 seems to do the right thing though.

Yes, I agree that multi-line heredoc delimiters are useful.

> Another interesting thing to try is
>
> 	cat << ${ foo
> 	${
>
> Also you can look at the quotes:
>
> 	cat << "
> 	"
>
> IOW it's not clear that "word" in this context necessarily follows
> the same rules used during normal tokenisation.

Quotes are clear: as far as they go, this *must* follow the same rules used during normal tokenisation. Does any shell disagree?

  cat << "A ' B"
  ok 1
  A ' B
  echo ok 2

I expect this to print

  ok 1
  ok 2

and I'm not finding any shell that disagrees, not even dash. Do you have an example of a shell that sees only "A as the heredoc delimiter, or one that perhaps performs quote removal on the inner '?
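For anyone who wants to repeat the check, a rough sketch of a loop that feeds the test case above to whichever shells happen to be installed. run_case is a made-up helper name, and the list of shell names is only a guess at what might be present on a given system.

```shell
# Hypothetical helper: run the quoted-delimiter test case in the shell
# named by $1. The outer heredoc is quoted ('EOF'), so the test case
# reaches the shell under test byte for byte.
run_case() {
    "$1" <<'EOF'
cat << "A ' B"
ok 1
A ' B
echo ok 2
EOF
}

# Try every candidate shell that is actually installed.
for shell in dash bash ksh zsh yash posh; do
    if command -v "$shell" >/dev/null 2>&1; then
        printf '== %s ==\n' "$shell"
        run_case "$shell"
    fi
done
```

A shell that took only "A as the delimiter, or that removed the inner ', would not find the terminating line and its output would differ from the two lines expected above.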

For your example, does any shell, including dash, *not* treat this as a multi-line heredoc delimiter consisting of a newline character (which may or may not be matched by two blank lines, depending on the shell)?

> Anyway, dash has had CHKEOFMARK since 2007 so this is just an
> extension of existing behaviour.

This is definitely true: your patch, regardless of whether it matches POSIX's requirements, is consistent with how dash has behaved for a long time.

Cheers,
Harald van Dijk