I've seen this issue in interop with modernish, where dash doesn't
properly support multibyte character encodings. It would be great to
have proper support for more interesting character sets.

It's worth looking at how other shells implement this, but a natural
choice is a more sophisticated representation of strings, where control
characters are represented out-of-band rather than by using 'unused'
byte values (well, unused by ASCII, but not by other encodings).
Unfortunately, this rope-like representation is not very convenient to
work with.

Cheers,
Michael

On 2023-01-11 at 02:01:03 AM, Harald van Dijk wrote:
> Hi,
>
> Please consider
>
>   alias $(printf "\204")="exit 2"
>   $(:)
>   echo ok
>
> This is a perfectly valid shell script. The alias command is permitted
> to either succeed or fail, and if it succeeds, it defines an alias whose
> name is the single byte '\204'. No command by that name is ever
> executed, so this script is required to then print 'ok' and exit
> successfully.
>
> This is not what happens. Internally, '\204' is the value of CTLBACKQ,
> and the word $(:) gets translated to an internal representation of just
> that -- coupled with a pointer to the parsed command. Since the word
> $(:) contains zero quote characters, it is subjected to alias expansion,
> and picks up this alias definition that makes the command expand to
> exit 2.
>
> Consider also:
>
>   alias $(printf "\201")="echo ok" $(printf "\201\201")="echo bad" &&
>   eval $(printf "\201")
>
> This should either print "ok", or reject the aliases. Instead, it prints
> "bad". This happens because '\201' is the internal representation of
> CTLESC, and a literal byte of that value is represented by escaping it
> with CTLESC. Therefore, it triggers the expansion of the \201\201 alias.
>
> Supporting alias names containing non-ASCII characters, while not
> required by POSIX, seems desirable, and almost all other shells (mksh
> being the exception) do appear to support this. I am not yet seeing a
> good way of solving this.
>
> Cheers,
> Harald van Dijk
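
To make the two failures in the quoted message concrete, here is a rough
Python model of in-band versus out-of-band word representation. It is only
a sketch and not dash's actual C code: the Literal and CommandSubst classes
and the lookup_* functions are hypothetical names invented for
illustration, and only the two control values named in the message
(CTLESC = '\201', CTLBACKQ = '\204') are modelled. The assumption is that
alias names are stored as raw bytes while the parsed word is compared in
its encoded form.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

CTLESC = 0x81    # '\201', escapes a literal byte that looks like a control value
CTLBACKQ = 0x84  # '\204', stands in for a command substitution


@dataclass(frozen=True)
class Literal:
    """A run of literal bytes in a word (hypothetical type)."""
    data: bytes


@dataclass(frozen=True)
class CommandSubst:
    """A command substitution such as $(:) (hypothetical type)."""
    source: str


Word = List[Union[Literal, CommandSubst]]


def encode_in_band(word: Word) -> bytes:
    """Flatten a word into one byte string, marking structure in-band."""
    out = bytearray()
    for seg in word:
        if isinstance(seg, CommandSubst):
            out.append(CTLBACKQ)
        else:
            for b in seg.data:
                if b in (CTLESC, CTLBACKQ):
                    out.append(CTLESC)  # escape the colliding literal byte
                out.append(b)
    return bytes(out)


def lookup_in_band(word: Word, aliases: Dict[bytes, str]) -> Optional[str]:
    """Model of the bug: the encoded word is compared with raw alias names."""
    return aliases.get(encode_in_band(word))


def lookup_out_of_band(word: Word, aliases: Dict[bytes, str]) -> Optional[str]:
    """Only a word made of exactly one literal segment can name an alias."""
    if len(word) == 1 and isinstance(word[0], Literal):
        return aliases.get(word[0].data)
    return None


# The aliases defined by the two scripts in the quoted message.
ALIASES = {b"\x84": "exit 2", b"\x81": "echo ok", b"\x81\x81": "echo bad"}
```

In this model, lookup_in_band([CommandSubst(":")], ALIASES) picks up the
'\204' alias and lookup_in_band([Literal(b"\x81")], ALIASES) picks up the
'\201\201' alias, which matches the behaviour Harald reports. With the
structure kept out of band, the first lookup finds nothing and the second
finds "echo ok". The cost is the one mentioned above: every consumer of a
word now has to walk a list of segments instead of a flat string.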