On 02/03/2018 08:49, Herbert Xu wrote:
> On Thu, Mar 01, 2018 at 08:24:22PM +0100, Harald van Dijk wrote:
>> On 01/03/2018 00:04, Harald van Dijk wrote:
>>> $ bash -c 'x=yz; echo "${x#'"'y'"'}"'
>>> z
>>> $ dash -c 'x=yz; echo "${x#'"'y'"'}"'
>>> yz
>>> (That is, they are executing x=yz; echo "${x#'y'}".)
>>> POSIX says that in "${var#pattern}" (and the same for ##, % and %%), the
>>> pattern is considered unquoted regardless of the outer quotation marks.
>>> Because of that, the single quote characters should not be taken
>>> literally, but should be taken as quoting the y. ksh, posh and zsh agree
>>> with bash.
>>
>> Unfortunately, this causes another problem with all of the backslash
>> approaches so far:
>> x='\\\\'; printf "%s\n" "${x#'\\\\'}"
>> This should print a blank line. (bash, ksh, posh and zsh agree.)
>> Here, dash's parser stores '$\$\', where $ is a control character. preglob
>> would need to turn this into \\\\\\\\. The problem is again that preglob
>> cannot increase the string length. Perhaps the parser needs to store this as
>> '$\$\$\$\', $ being either CTLESC or that new CTLBACK? Either way, it
>> requires some more invasive changes.
>
> These are different issues. dash's parser currently does not
> understand nested quoting in patterns at all. That is, if your
> parameter expansion is within double quotes, then dash at the
> parser level will consider the pattern to be double-quoted. Thus
> any nested single-quotes will be literals instead of actual quotes.

That's the same thing though. The problem with the backslashes is also
that dash sees them as double-quoted when they should be seen as
unquoted, and the approach taken in commit
7cfd8be0dc83342b4a71f3a8e5b7efab4670e50c that lasts to this day was
specifically to *not* fix this in the parser, but to simply have the
parser record enough information so that quote status can be determined
and patched up during expansion. It's just that in the case of single
quotes, expansion was never modified to recognise them. Thinking some
more, I don't think the parser actually records enough information to
let that work.
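
For reference, a small script that checks both cases at once. This is only a sketch: the variable names and the report helper are made up for illustration. It compares each double-quoted expansion against the same expansion written unquoted, which is parsed with the base syntax anyway; if the pattern really is treated as unquoted, the two should agree.

```shell
#!/bin/sh
# Sketch only: names below are illustrative, not taken from dash.

x=yz
unquoted_sq=${x#'y'}      # pattern parsed outside double quotes
quoted_sq="${x#'y'}"      # same pattern inside double quotes

bs='\\\\'                 # four literal backslashes
unquoted_bs=${bs#'\\\\'}
quoted_bs="${bs#'\\\\'}"

report() {
    # $1 = label, $2 = unquoted result, $3 = double-quoted result
    if [ "$2" = "$3" ]; then
        printf '%s: consistent (<%s>)\n' "$1" "$2"
    else
        printf '%s: differs (unquoted <%s>, quoted <%s>)\n' "$1" "$2" "$3"
    fi
}

report 'single quotes' "$unquoted_sq" "$quoted_sq"
report 'backslashes' "$unquoted_bs" "$quoted_bs"
```

A shell that shows "differs" for either line is treating the pattern as double-quoted.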

> If we fix this in the parser then everything should just work.

Right, that's the approach FreeBSD sh has taken that I referred to in my
message from Feb 18, that I'd personally prefer as well. It basically
involves reverting 7cfd8be0dc83342b4a71f3a8e5b7efab4670e50c, setting
syntax to BASESYNTAX/DQSYNTAX (whichever is appropriate) when the parse
of a variable expansion starts, and finding a sensible way to change the
syntax back to BASESYNTAX/DQSYNTAX/ARISYNTAX when it ends. In FreeBSD
sh, an explicit stack of syntaxes is created for this, but that might be
avoidable: with slight modifications to what gets stored in the byte
after CTLVAR/CTLARI, it might be possible to go back through the parser
output to determine the syntax to revert to. I'll see if I can get that
working.
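
To illustrate the idea of a stack of syntaxes, here is a toy model written in shell. It is not FreeBSD sh's code and not a proposed dash patch: the function name, the "push"/"flat" modes and the one-letter syntax codes are all invented for this sketch, and it only understands ", ', } and expansions that use a # or % operator. In "push" mode a fresh base syntax is pushed when such an expansion starts and popped at its closing brace; in "flat" mode the expansion inherits the enclosing syntax.

```shell
#!/bin/sh
# Toy model: B = base (unquoted), D = double-quoted, S = single-quoted.
# Reports, for each opening or literal single quote, how it is classified.

classify_single_quotes() {
    rest=$1
    mode=${2:-push}
    stack=B                            # bottom entry: unquoted input
    out=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}          # first character of the input
        rest=${rest#?}
        top=${stack#"${stack%?}"}      # syntax currently in effect
        case $top$c in
            S\')  stack=${stack%?} ;;                   # closes '...'
            S*)   ;;                                    # literal inside '...'
            B\')  stack=${stack}S; out="$out quote" ;;  # opens '...'
            D\')  out="$out literal" ;;                 # plain character
            B\")  stack=${stack}D ;;
            D\")  stack=${stack%?} ;;
            [BD]\$)
                case $rest in
                    \{*)
                        rest=${rest#*[#%]}              # skip "{name" and operator
                        if [ "$mode" = push ]; then
                            stack=${stack}B             # pattern starts unquoted
                        fi
                        ;;
                esac ;;
            B\})
                if [ "${#stack}" -gt 1 ]; then
                    stack=${stack%?}                    # expansion ends
                fi ;;
        esac
    done
    printf '%s\n' "${out# }"
}

frag='"${x#'"'y'"'}"'                  # the fragment "${x#'y'}"
classify_single_quotes "$frag" push    # prints: quote
classify_single_quotes "$frag" flat    # prints: literal literal
```

The "flat" result corresponds to the single quotes being taken literally, the "push" result to them quoting the y.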
Cheers,