Re: make ctype ascii only? (was [PATCH] kbuild: treat char as always signed)

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 26 Oct 2022 11:10:57 -0700

On Tue, Oct 25, 2022 at 5:10 PM Rasmus Villemoes
<linux@xxxxxxxxxxxxxxxxxx> wrote:
>
> Only very tangentially related (because it has to do with chars...): Can
> we switch our ctype to be ASCII only, just as it was back in the good'ol
> mid 90s

Those US-ASCII days weren't really very "good" old days, but I forget
why we did this (it's attributed to me, but that's from the
pre-BK/pre-git days before we actually tracked things all that well,
so..)

Anyway, I think anybody using ctype.h on 8-bit chars gets what they
deserve, and I think Latin1 (or something close to it) is better than
US-ASCII, in that it's at least the same as Unicode in the low 8
chars.

So no, I'm disinclined to go back in time to what I think is an even
worse situation. Latin1 isn't great, but it sure beats US-ASCII. And
if you really want just US-ASII, then don't use the high bit, and make
your disgusting 7-bit code be *explicitly* 7-bit.

Now, if there are errors in that table wrt Latin1 / "first 256
codepoints of Unicode" too, then we can fix those.

Not that anybody has apparently cared since 2.0.1 was released back in
July of 1996 (btw, it's sad how none of the old linux git archive
creations seem to have tried to import the dates, so you have to look
those up separately)

And if nobody has cared since 1996, I don't really think it matters.

But fundamentally, I think anybody calling US-ASCII "good" is either
very very very confused, or is comparing it to EBCDIC.

                 Linus