Re: dash bug: double-quoted "\" breaks glob protection for next char

On 02/03/2018 08:49, Herbert Xu wrote:
> On Thu, Mar 01, 2018 at 08:24:22PM +0100, Harald van Dijk wrote:
>> On 01/03/2018 00:04, Harald van Dijk wrote:
>>> $ bash -c 'x=yz; echo "${x#'"'y'"'}"'
>>> z
>>>
>>> $ dash -c 'x=yz; echo "${x#'"'y'"'}"'
>>> yz
>>>
>>> (That is, they are executing x=yz; echo "${x#'y'}".)
>>>
>>> POSIX says that in "${var#pattern}" (and the same for ##, % and %%), the
>>> pattern is considered unquoted regardless of the outer quotation marks.
>>> Because of that, the single quote characters should not be taken
>>> literally, but should be taken as quoting the y. ksh, posh and zsh agree
>>> with bash.
>>
>> Unfortunately, this causes another problem with all of the backslash
>> approaches so far:
>>
>>    x='\\\\'; printf "%s\n" "${x#'\\\\'}"
>>
>> This should print a blank line. (bash, ksh, posh and zsh agree.)
>>
>> Here, dash's parser stores '$\$\', where $ is a control character. preglob
>> would need to turn this into \\\\\\\\. The problem is again that preglob
>> cannot increase the string length. Perhaps the parser needs to store this as
>> '$\$\$\$\', $ being either CTLESC or that new CTLBACK? Either way, it
>> requires some more invasive changes.
>
> These are different issues.  dash's parser currently does not
> understand nested quoting in patterns at all.  That is, if your
> parameter expansion is within double quotes, then dash at the
> parser level will consider the pattern to be double-quoted.  Thus
> any nested single-quotes will be literals instead of actual quotes.

That's the same thing though. The problem with the backslashes is also that dash sees them as double-quoted when they should be seen as unquoted, and the approach taken in commit 7cfd8be0dc83342b4a71f3a8e5b7efab4670e50c that lasts to this day was specifically to *not* fix this in the parser, but to simply have the parser record enough information so that quote status can be determined and patched up during expansion. It's just that in the case of single quotes, expansion was never modified to recognise them. Thinking some more, I don't think the parser actually records enough information to let that work.
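
For anyone who wants to check a given shell quickly, the two test cases quoted above can be wrapped in a small probe. This is only a sketch: the function names are made up for illustration, and the labels it prints are mine. Per the reports in this thread, bash, ksh, posh and zsh should come out as "quoting" and "stripped"; a shell that treats the pattern as double-quoted will report something else.

```shell
# Probe: does the running shell treat the pattern in "${var#pattern}" as
# unquoted even when the whole expansion sits inside double quotes?
# Function names and output labels are hypothetical, chosen for this sketch.

probe_single_quotes() {
    x=yz
    case "${x#'y'}" in
    z)  echo quoting ;;     # the quotes quoted the y, so the prefix was removed
    yz) echo literal ;;     # the quotes were taken literally, nothing matched
    *)  echo unexpected ;;
    esac
}

probe_backslashes() {
    x='\\\\'                # four literal backslashes
    if [ -z "${x#'\\\\'}" ]; then
        echo stripped       # pattern matched all four backslashes
    else
        echo kept           # pattern did not match the whole value
    fi
}

echo "single quotes: $(probe_single_quotes)"
echo "backslashes:   $(probe_backslashes)"
```

Run it as "sh probe.sh", "dash probe.sh" and so on to compare shells side by side.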

> If we fix this in the parser then everything should just work.

Right, that's the approach FreeBSD sh has taken that I referred to in my message from Feb 18, that I'd personally prefer as well. It basically involves reverting 7cfd8be0dc83342b4a71f3a8e5b7efab4670e50c, setting syntax to BASESYNTAX/DQSYNTAX (whichever is appropriate) when the parse of a variable expansion starts, and finding a sensible way to change the syntax back to BASESYNTAX/DQSYNTAX/ARISYNTAX when it ends. In FreeBSD sh, an explicit stack of syntaxes is created for this, but that might be avoidable: with slight modifications to what gets stored in the byte after CTLVAR/CTLARI, it might be possible to go back through the parser output to determine the syntax to revert to. I'll see if I can get that working.
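
To make the syntax-stack idea concrete, here is a toy scanner written in shell itself. It is not dash's or FreeBSD sh's parser code, just a sketch under heavy simplifications: the function name is made up, the stack is a plain string of B (standing in for BASESYNTAX) and D (standing in for DQSYNTAX), the whole body of ${...} is treated as base syntax rather than only the pattern part, and backslashes, command substitution and arithmetic are ignored. All it does is report, for each single quote it meets, whether that quote would act as a quoting character or as a literal.

```shell
# Toy scanner: classify single quotes in a word as "quoting" or "literal".
# The last character of $stack is the current syntax:
#   B = unquoted (BASESYNTAX-like), D = inside double quotes (DQSYNTAX-like).
# Variables are global because POSIX sh has no "local".
classify_single_quotes() {
    rest=$1
    stack=B
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}        # first character of $rest
        rest=${rest#?}
        top=${stack#"${stack%?}"}    # last character of $stack
        case $c in
        '"')
            # a double quote closes DQ syntax if it is on top, else opens it
            if [ "$top" = D ]; then stack=${stack%?}; else stack=${stack}D; fi ;;
        '$')
            # "${" starts an expansion: push base syntax for its body
            case $rest in
            '{'*) rest=${rest#?}; stack=${stack}B ;;
            esac ;;
        '}')
            # end of expansion: pop back to whatever syntax was active before
            if [ "$top" = B ] && [ "${#stack}" -gt 1 ]; then stack=${stack%?}; fi ;;
        "'")
            if [ "$top" = B ]; then
                echo quoting
                # skip the quoted text up to and including the closing quote
                case $rest in
                *"'"*) rest=${rest#*"'"} ;;
                *)     rest= ;;
                esac
            else
                echo literal
            fi ;;
        esac
    done
}

classify_single_quotes '"${x#'"'y'"'}"'   # the word "${x#'y'}"
classify_single_quotes "\"'y'\""          # the word "'y'"
```

The first call reports one quoting single quote, because the ${ pushed base syntax on top of the double-quote state; the second reports two literal ones. The variant I mention above would drop the explicit stack and recover the previous syntax from what is stored after CTLVAR/CTLARI instead.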

Cheers,