Consider this:

cat <<`bad`
`bad`

As far as I can tell, this is technically valid, supposed to print
nothing, and accepted in most other shells.
POSIX Token Recognition says `bad` is to be recognised as a single token
of type word, and any word can act as a heredoc delimiter. POSIX
Here-Document says only quote removal is performed on the word.
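
For comparison, here is a minimal sketch of the ordinary effect of quote removal on a delimiter word. The function names and the variable v are made up for illustration. Both heredocs are terminated by the same line, EOF; quoting the word only changes whether the body is expanded.

```shell
# Quoted delimiter: after quote removal the delimiter is EOF,
# and the body is taken literally.
quoted_delim() {
cat <<'EOF'
$HOME
EOF
}

# Unquoted delimiter: same terminating line, but the body is expanded.
unquoted_delim() {
v=expanded
cat <<EOF
$v
EOF
}

quoted_delim     # prints $HOME
unquoted_delim   # prints expanded
```
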
However, dash does not always preserve the original spelling of the
word. That's what's going on here. Because dash hasn't preserved the
`bad` spelling (it's been turned into CTLBACKQ), the check for the end
of the heredoc doesn't pick it up, and it instead gets taken as a
command substitution.
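
To compare shells on this case, something like the following sketch may help. The helper names and the list of shells are assumptions, not part of any test suite; adjust the list to whatever is installed. The script is fed on standard input so that no extra layer of quoting is needed around the backquotes.

```shell
# Emit the two-line test case. printf is used so that this helper
# does not itself need a heredoc containing backquotes.
emit_case() {
    printf '%s\n' 'cat <<`bad`' '`bad`'
}

# Run the case in each candidate shell that is installed and report
# the exit status and combined output. The shell list is a guess.
try_shells() {
    for sh in dash bash ksh mksh zsh yash; do
        command -v "$sh" >/dev/null 2>&1 || continue
        if out=$(emit_case | "$sh" 2>&1); then
            status=0
        else
            status=$?
        fi
        printf '%s: status=%s output=[%s]\n' "$sh" "$status" "$out"
    done
}
```

Calling try_shells prints one line per installed shell. A shell that handles the delimiter as described above should show empty output.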
When an actual 0x84 byte is seen in the input, *that* gets taken as the
heredoc delimiter:

dash -c "tail -n1 <<\`:\`
`printf \\\204`
ok
\`:\`"

In a locale in which 0x84 is a valid character (since dash doesn't
support locales, that's easy, it's always valid), it's supposed to print
"ok". dash instead interprets the second line as the end of the heredoc
and subsequently issues an error message when it interprets "ok" on line
3 as a command to execute.
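
The nested quoting in that command is easy to get wrong, so here is a sketch that builds the same four-line script with printf and dumps its bytes. The function name is made up. \204 in a printf format string is the octal escape for byte 0x84.

```shell
# Emit the second test case as a plain script:
#   line 1: tail -n1 <<`:`
#   line 2: the single byte 0x84 (octal 204)
#   line 3: ok
#   line 4: `:`
emit_ctl_case() {
    printf 'tail -n1 <<`:`\n\204\nok\n`:`\n'
}

# Dump the bytes to confirm that line 2 really is just 0x84.
emit_ctl_case | od -An -tx1
```

Piping the result into a shell, for example emit_ctl_case | dash, should reproduce the behaviour described above without going through dash -c and an outer command substitution.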
This is pretty clearly a case that no serious script is ever going to
encounter, and one that many shells don't even attempt to support, at
least not completely, so I don't think this is a real problem. I'm
mentioning it anyway because I was trying to come up with a few more
test cases for the parser, and I think it's good to have a record not
only of what worked, what has been made to work, and what got broken,
but also of what's never going to work.
Cheers,
Harald van Dijk