On 2024-12-02 12:26, G. Branden Robinson wrote:
>> Why shouldn't the test succeed? Solaris 10 /usr/bin/tr supports
>> character classes like [:cntrl:].
> It doesn't for me in the instance at gcc210.fsffrance.org.
It works for me there:
$ uname -a
SunOS gcc-solaris10 5.10 Generic_Virtual sun4u sparc
SUNW,SPARC-Enterprise
$ printf 'a\003b\n' | /usr/bin/tr '[:cntrl:]' '[ *]'
a b $
Solaris 10 /usr/bin/tr has trouble in multibyte locales. Perhaps that's
your problem? 'configure' should set LC_ALL=C early on, though, so if
this is your problem it suggests that part of your script is messing
with the locale, when it shouldn't be.
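As a rough sketch of how a script can sidestep the locale question, the locale can be pinned for each tr invocation. The helper names to_lower and blank_cntrl are made up for illustration and do not come from your script or from Autoconf:

```shell
# Hypothetical helper: lowercase stdin without depending on range semantics.
# LC_ALL=C applies to this one tr invocation only, and the letters are
# spelled out instead of written as A-Z and a-z.
to_lower () {
  LC_ALL=C tr ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz
}

# Hypothetical helper: turn every control character on stdin into a space.
# Note that newline is itself a control character, so it is replaced too.
blank_cntrl () {
  LC_ALL=C tr '[:cntrl:]' '[ *]'
}

printf 'Hello World\n' | to_lower
printf 'a\003b\n' | blank_cntrl
echo    # blank_cntrl consumed the newline, so end the line here
```

If these behave correctly while the bare tr calls in the script do not, that points at the locale being changed somewhere before tr runs.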
I installed the attached doc patch into Autoconf master to document more
of the tr issues.

From b40645caa91dad69ba8a14ef53dc0013e12497fc Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@xxxxxxxxxxx>
Date: Mon, 2 Dec 2024 12:47:54 -0800
Subject: [PATCH] doc: mention tr issues in multi-byte locales
* doc/autoconf.texi (tr): Mention multi-byte issues.
---
doc/autoconf.texi | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 222647b8..dd0b1fa2 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -19985,6 +19985,14 @@ timestamp truncation problems that @samp{cp -p} has.
@item @command{tr}
@c ---------------
@prindex @command{tr}
+
+Many @command{tr} implementations do not support multi-byte locales
+well. For example, Solaris 10 @command{tr} rejects character classes in
+multi-byte locales. Also, ranges have well-defined behavior only in the
+@samp{C} (or @samp{POSIX}) locale, so if you cannot guarantee the
+setting of @env{LC_ALL} it is better to spell out a range
+@samp{[ABCDEFGHIJKLMNOPQRSTUVWXYZ]} than to rely on @samp{[A-Z]}.
+
@cindex carriage return, deleting
@cindex newline, deleting
@cindex deleting carriage return
--
2.43.0