[BUG] ${#var} returns length in bytes, not characters

POSIX:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
> ${#parameter}
> String Length. The length in characters of the value of parameter
> shall be substituted. [...]

dash does not expand this to the length in characters; it expands it to the
length in bytes instead. That is incorrect in locales with multi-byte
character encodings, such as the now ubiquitous UTF-8.

Test case:

$ locale
LANG="nl_NL.UTF-8"
LC_COLLATE="nl_NL.UTF-8"
LC_CTYPE="nl_NL.UTF-8"
LC_MESSAGES="nl_NL.UTF-8"
LC_MONETARY="nl_NL.UTF-8"
LC_NUMERIC="nl_NL.UTF-8"
LC_TIME="nl_NL.UTF-8"
LC_ALL=
$ word='bètatest'	# length: 8
$ echo ${#word}
9

Expected output: 8
Got output: 9

(bash, ksh93, mksh, and zsh all do this correctly.)
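
Until this is fixed, a possible workaround is to count characters outside
the shell. The sketch below assumes the value is valid UTF-8; utf8_len is a
hypothetical helper name, not something dash provides. It deletes the UTF-8
continuation bytes (octal 200-277) and counts the bytes that remain, which
gives one per character. The word is built from explicit octal escapes so
the example does not depend on the encoding of the script file.

```shell
# Hypothetical helper (not part of dash): count the characters in a
# UTF-8 string by deleting continuation bytes (0x80-0xBF) and counting
# the bytes that are left, one per character.
utf8_len() {
    # LC_ALL=C makes tr operate on single bytes; $(( )) drops the
    # padding that some wc implementations print around the number.
    echo $(( $(printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c) ))
}

word=$(printf 'b\303\250tatest')   # "bètatest" as explicit UTF-8 bytes
utf8_len "$word"                   # prints 8; ${#word} in dash gives 9
```

In a UTF-8 locale, printf '%s' "$word" | wc -m should give the same count,
since POSIX specifies that wc -m counts characters, but that result depends
on the current locale, whereas the helper above does not.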

- Martijn