On Wed, Sep 08, 2010 at 01:26:15AM +0400, Alexey Zinovyev wrote:
> Hello, I think there is a bug in read() builtin.

> $ cat test
> echo 'ρ'|while read i; do echo $i; done
> $ dash test
> $ bash test
> ρ

> Same with some japanese symbols.
> Looks like dash strips 0x81 byte.

0x81 == CTLESC, the escape character in dash's internal representation.

> diff --git a/src/miscbltin.c b/src/miscbltin.c
> index 5ab1648..f8c5655 100644
> --- a/src/miscbltin.c
> +++ b/src/miscbltin.c
> @@ -101,7 +101,6 @@ readcmd_handle_line(char *line, char **ap, size_t len)
>  	 * will not modify the length of the string */
>  	offset = sl->text - s;
>  	remainder = backup + offset;
> -	rmescapes(remainder);
>  	setvar(*ap, remainder, 0);
>  
>  	return;

This patch is not correct as it will leave 0x81 bytes for backslash
escapes. That is probably a bit worse than ignoring the backslashes
entirely, which is what it does now. It attempts to "escape" the next
character by placing a CTLESC, but CTLESC does not and should not
escape IFS characters for ifsbreakup(); the recordregion() mechanism
should be used for that.

(For the intermediate representation generated by parser.c, CTLESC does
escape IFS characters. This is not ideal as it prevents IFS splitting
with CTL* bytes in word in ${var+-word}.)

The patch I posted separately fixes the handling of 0x81 and various
other issues with read (by using separate code instead of trying to use
expand.c). Backslash escaping works too although I have just found some
bugs with corner cases.

-- 
Jilles Tjoelker