Re: [PATCH 0/8] Add multi-byte support

On Sun, 2024-04-28 at 08:49 +0800, Herbert Xu wrote:
> 
> Are you talking about a theoretical undefined condition, or an
> actual one?  Which shell doesn't deal with ${foo%.} correctly?

Well my main point for this mail was how dash does it (and not just
about '.').
I guess it simply resorts to fnmatch(3) so it will probably do whatever
the system's libc does?

But it's really not just about '.' (which is more or less a rather safe
case).
If someone assumes dash would always be LC_ALL=C, then any such
operation where e.g. a byte is used that is part of a multi-byte
character might, AFAIU, lead to unexpected results.
E.g. an fnmatch implementation may just decide to stop and give an
error at the first byte that's not a valid character, right?
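
To make the concern about bytes inside a multi-byte character a bit more
concrete, here is a minimal sketch. It assumes UTF-8 encoded input and a
shell that matches byte-wise under LC_ALL=C (which is what dash has done
so far); the variable names are purely illustrative. It only shows the
byte-wise case, not what any particular locale-aware fnmatch would do.

```shell
#!/bin/sh
# Assumption: byte-wise pattern matching, forced here via the C locale.
LC_ALL=C
export LC_ALL

# "café" encoded as UTF-8: the final é is the two bytes 0xC3 0xA9.
s=$(printf 'caf\303\251')

# In the C locale, ? matches exactly one byte, so only 0xA9 is removed
# and a dangling lead byte 0xC3 is left at the end of the string.
r=${s%?}

before=$(printf %s "$s" | wc -c | tr -d ' ')
after=$(printf %s "$r" | wc -c | tr -d ' ')
printf 'bytes before: %s, after: %s\n' "$before" "$after"
```

The result is no longer valid UTF-8, which is the kind of input a
locale-aware matcher might then treat differently.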

Also there were some locales like Big5, which had the weird property of
having multibyte chars that contain byte sequences that form other
valid chars (see [0]).
Not sure if I remember that correctly, but it might have been undefined
when stripping off the "shorter" character from that.
Which again, couldn't have happened so far in dash, as it simply was C
locale only.
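
A rough sketch of the Big5 property mentioned above, again only for the
byte-wise (C locale) case. It assumes the commonly cited Big5 encoding in
which one two-byte character is 0xA5 0x5C, i.e. its trail byte is the
same byte as an ASCII backslash; the variable names are made up for
illustration.

```shell
#!/bin/sh
# Assumption: byte-wise pattern matching, forced here via the C locale.
LC_ALL=C
export LC_ALL

# Assumed Big5 two-byte character: lead byte 0xA5, trail byte 0x5C.
s=$(printf '\245\134')

# A single backslash byte (0x5C), which is a valid character on its own.
bs=$(printf '\134')

# Quoting "$bs" makes the pattern literal; byte-wise matching strips the
# trail byte and leaves only the lead byte 0xA5 behind.
r=${s%"$bs"}

left=$(printf %s "$r" | wc -c | tr -d ' ')
printf 'bytes left: %s\n' "$left"
```

What a shell that is aware of a Big5 locale should do here is exactly
the part that may differ between implementations.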


Harald van Dijk did some extensive tests back then on how different
shells behave.
I think the austin-group-l mailing list archive is not publicly
available, but if you have an account, it was in this mail:
https://collaboration.opengroup.org/operational/mailarch.php?soph=N&action=show&archive=austin-group-l&num=34339&limit=100&offset=0
which showed that shells do indeed behave differently (the tests
weren't for '.').


Cheers,
Chris.

[0] https://unix.stackexchange.com/questions/383217/shell-keep-trailing-newlines-n-in-command-substitution/383411#383411




