On Fri, Mar 2, 2018 at 7:03 PM, Harald van Dijk <harald@xxxxxxxxxxx> wrote:
> On 02/03/2018 18:00, Denys Vlasenko wrote:
>>
>> On Wed, Feb 14, 2018 at 9:03 PM, Harald van Dijk <harald@xxxxxxxxxxx>
>> wrote:
>>>
>>> Currently:
>>>
>>> $ dash -c 'foo=a; echo "<${foo#[a\]]}>"'
>>> <>
>>>
>>> This is what I expect, and also what bash, ksh and posh do.
>>>
>>> With your patch:
>>>
>>> $ dash -c 'foo=a; echo "<${foo#[a\]]}>"'
>>> <a>
>>
>> I was looking into this specific example and I believe it is a _bash_ bug.
>>
>> The [a\]] is misinterpreted by it (and probably by many people).
>> The gist is: \] is not a valid escape for ] in set glob expression.
>> Glob sets have no escaping at all, ] can be in a set
>> if it is the first char: []abc],
>> dash can be in a set if it is first or last: [abc-],
>> [ and \ need no protections at all: [a[b\c] is a valid set of 5 chars.
>>
>> Therefore, "[a\]]" glob pattern means "a or \, then ]".
>> Since that does not match "a", the result of ${foo#[a\]]}> should be "a".
>
> Are you sure about this? "Patterns Matching a Single Character"'s first
> paragraph contains "A <backslash> character shall escape the following
> character. The escaping <backslash> shall be discarded." The shell does this
> first.

I have a problem with the "The shell does this first" statement.

It's useful to view the entire discussion of glob pattern matching
as a discussion of how fnmatch(pattern, string, flags) should behave
(even if a particular shell implementation chooses not to use
the C library's fnmatch() to implement its globbing).

Otherwise (IOW: if you allow globbing to depend on the shell's quoting),
the rules for globbing will not be consistent across different applications.
Which would be bad.

As I see it, the shell should massage its input according to shell rules
(quote/backslash removal et al), then use fnmatch() or glob(),
or its own internal implementations of them.

bash does not seem to do this.
It probably has a "combined" routine which does both in one step,
which allows quote removal to interfere with globbing.
Here's the proof:

$ x='a]'; echo _${x#[a\]]}_
_]_

In the above code, what pattern should be fed to fnmatch(),
assuming the shell uses fnmatch() to implement ${x#pattern}?
The pattern should be "[a]]", because by shell rules "\]" in an unquoted
string is "]". But try this:

$ x='a]'; echo _${x#[a]]}_
__

Here, the pattern should be "[a]]" as well - it literally is.
But the results are different! Evidently, bash does _not_ perform
quote removal (more precisely, backslash removal) on the pattern string.
Somehow, the globbing code knows \ was there.

(And this globbing code, in my opinion, also misinterprets [a\]]
as "set of 'a' or ']'", but (a) I might be wrong on this, and
(b) this is a bit offtopic: we are discussing ${x#pattern} handling here.)

To me, it looks like the bash behavior is buggy regardless of what \] means
in glob patterns. These two should be equivalent:

x='a]'; echo _${x#[a\]]}_
x='a]'; echo _${x#[a]]}_

because they should use the same pattern for the globbing match.

The alternative possibility is that the pattern in ${x#pattern} is not
handled by the usual shell rules: backslashes are not removed.
This would be VERY ugly as soon as nested variable expansions are considered.
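For anyone who wants to check how a particular shell treats the two spellings compared above, here is a minimal sketch. strip_prefix is a hypothetical helper made up for this illustration, not something from dash or bash. It receives the pattern through a variable, so the matcher sees exactly the characters that were stored. The script asserts nothing about the contested \] case; it only prints what the shell under test does.

```shell
# Hypothetical helper (an assumption for illustration, not from the thread):
# print $1 with the shortest prefix matching the glob pattern in $2 removed.
# $2 is left unquoted inside the expansion so its glob characters stay active.
strip_prefix() {
    printf '%s\n' "${1#$2}"
}

x='a]'

# The two literal spellings from the message. An assignment performs no field
# splitting or pathname expansion, so this mirrors the unquoted echo examples.
escaped=${x#[a\]]}
unescaped=${x#[a]]}

# The backslash-free pattern again, this time passed through a variable.
via_var=$(strip_prefix "$x" '[a]]')

printf '%s _%s_\n' 'escaped:     ' "$escaped"
printf '%s _%s_\n' 'unescaped:   ' "$unescaped"
printf '%s _%s_\n' 'via variable:' "$via_var"

# Report whether this shell hands both spellings to its matcher as one pattern.
if [ "$escaped" = "$unescaped" ]; then
    echo "both spellings give the same result in this shell"
else
    echo "the two spellings give different results in this shell"
fi
```

Running this under several shells (for example bash and dash) and comparing the "escaped" and "unescaped" lines should show whether that shell keeps the backslash visible to its matching code.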