From: Linus Torvalds
Sent: 21 December 2022 17:07
> On Wed, Dec 21, 2022 at 7:56 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >
> > The above assumes an unsigned char as input to strcmp(). I consider that
> > a hypothetical problem because "comparing" strings with upper bits
> > set doesn't really make sense in practice (How does one compare Günter
> > against Gunter ? And how about Gǖnter ?). On the other side, the problem
> > observed here is real and immediate.
>
> POSIX does actually specify "Günter" vs "Gunter".
>
> The way strcmp is supposed to work is to return the sign of the
> difference between the byte values ("unsigned char").
>
> But that sign has to be computed in 'int', not in 'signed char'.
>
> So yes, the m68k implementation is broken regardless, but with a
> signed char it just happened to work for the US-ASCII case that the
> crypto case tested.
>
> I think the real fix is to just remove that broken implementation
> entirely, and rely on the generic one.
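
The byte-value rule above can be illustrated with a small C sketch. This is not the kernel's generic strcmp() nor the m68k code; the names cmp_bytes_narrow(), cmp_bytes_int() and strcmp_sketch() are hypothetical, and the narrowing result assumes a compiler (such as GCC or Clang) that wraps out-of-range conversions to 'signed char'.

```c
/* Sketch only: hypothetical helpers, not kernel functions. */

/*
 * Broken: the difference is narrowed to 'signed char'. Any difference
 * above 127 wraps (implementation-defined; GCC and Clang reduce it
 * modulo 256), so the sign can come out wrong.
 */
static int cmp_bytes_narrow(unsigned char a, unsigned char b)
{
	signed char delta = (signed char)(a - b);

	return delta;
}

/* Correct: treat both bytes as 'unsigned char', subtract in 'int'. */
static int cmp_bytes_int(unsigned char a, unsigned char b)
{
	return (int)a - (int)b;
}

/* A plain byte-wise strcmp built on the 'int' difference. */
static int strcmp_sketch(const char *s1, const char *s2)
{
	const unsigned char *p1 = (const unsigned char *)s1;
	const unsigned char *p2 = (const unsigned char *)s2;

	/* Advance while the bytes match and the string has not ended. */
	while (*p1 && *p1 == *p2) {
		p1++;
		p2++;
	}
	return cmp_bytes_int(*p1, *p2);
}
```

With a Latin-1 'ü' (byte 0xfc) against 'u' (0x75) the true difference is 135, which no longer fits in a 'signed char', while any pair of US-ASCII bytes differs by at most 127 and so survives the narrowing.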
I wonder how much slower the generic version is - m68k is likely to be
microcoded and I don't think instruction timings are actually available.
The fastest version probably uses subx (subtract with extend, i.e. with
the borrow flag) to generate 0/-1 and leaves +delta for the other
result - but getting the compares and branches in the right order is hard.
I believe some of the other m68k asm functions are also missing
the "memory" clobber and so could get mis-optimised.
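
A minimal sketch of what the "memory" clobber changes, assuming GCC or Clang extended asm. The asm body is deliberately empty so the example builds on any target; touch_buffer() and fill_and_touch() are hypothetical names, not kernel functions. A real m68k string routine would have instructions in the template that dereference the pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for an asm routine that reads through 'p'. */
static inline void touch_buffer(const char *p)
{
	/*
	 * The "r"(p) input only says the pointer VALUE is used.
	 * The "memory" clobber additionally says the asm may read or
	 * write the bytes behind it, so the compiler must complete
	 * earlier stores before the asm and must not reuse values it
	 * cached from memory afterwards. Without the clobber it is
	 * free to delay or drop the stores made by the caller.
	 */
	__asm__ __volatile__("" : : "r"(p) : "memory");
}

/* Fill 'buf' with a NUL-terminated run of 'a', then hand it to the asm. */
static size_t fill_and_touch(char *buf, size_t n)
{
	size_t i;

	for (i = 0; i + 1 < n; i++)
		buf[i] = 'a';
	if (n)
		buf[i] = '\0';

	touch_buffer(buf);	/* the stores above must be visible here */
	return i;
}
```

An alternative that avoids the blanket clobber is to pass the buffer as an "m" input operand, which tells the compiler exactly which object the asm reads.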
While I can write (or rather have written) m68k asm, I don't have
a compiler to test with.
I also suspect that any x86 code that uses 'rep scas' is going
to be slow on anything modern.
David