RE: [PATCH v3] serial: make uart_console_write->putchar()'s character an unsigned char

On Thu, 3 Mar 2022, David Laight wrote:

> > And indeed it happens with the MIPS target:
> > 
> > 803ae47c:	82050000 	lb	a1,0(s0)
> > 803ae480:	26100001 	addiu	s0,s0,1
> > 803ae484:	02402025 	move	a0,s2
> > 803ae488:	0220f809 	jalr	s1
> > 803ae48c:	30a500ff 	andi	a1,a1,0xff
> > 
> > vs current code:
> > 
> > 803ae47c:	82050000 	lb	a1,0(s0)
> > 803ae480:	26100001 	addiu	s0,s0,1
> > 803ae484:	0220f809 	jalr	s1
> > 803ae488:	02402025 	move	a0,s2
> > 
> > (NB the last instruction shown after the call instruction, JALR, is in the
> > delay slot that is executed before the PC gets updated).  Now arguably the
> > compiler might notice that and use an unsigned LBU load instruction rather
> > than the signed LB load instruction, which would make the ANDI instruction
> > redundant, but still I think we ought to avoid gratuitous type signedness
> > changes.
> > 
> >  So I'd recommend changing `s' here to `const unsigned char *' or, as I
> > previously suggested, maybe to `const u8 *' even.
> 
> Or just not worry that the 'char' value (either [-128..127] or [0..255])
> is held in a 'signed int' variable.
> That basically happens every time it is loaded into a register anyway.

 That might be true with a hypothetical 8-bit ABI on top of a higher-width 
machine architecture.  It does happen with the 32-bit MIPS ABI (o32) and a 
64-bit architecture, which is why LW (load word signed) is a universal 
32-bit and 64-bit instruction while the LWU (load word unsigned) operation 
is restricted to 64-bit code.

 In this case however a signed `char' value ([-128..127]) is sign-extended 
while an unsigned `char' value ([0..255]) is zero-extended, even though 
both are carried in a 'signed int' variable from the architecture's point 
of view.
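
 To illustrate the distinction, here is a minimal user-space sketch.  The 
helper names are made up for this example and are not part of the patch; 
they only model what LB, LBU and ANDI do to a byte once it sits in a 
register-sized `int':

```c
#include <limits.h>

/* Models LB: a signed byte is sign-extended when widened to `int'. */
static inline int widen_signed(signed char c)
{
	return c;
}

/* Models LBU: an unsigned byte is zero-extended when widened to `int'. */
static inline int widen_unsigned(unsigned char c)
{
	return c;
}

/* Models `andi a1,a1,0xff': keep only the low 8 bits of the register. */
static inline int mask_low_byte(int reg)
{
	return reg & 0xff;
}
```

 With these, widen_signed(-128) yields -128, widen_unsigned(128) yields 
128, and mask_low_byte(widen_signed(-128)) yields 128 again, which is the 
extra step a signed load needs before the value can be passed as an 
`unsigned char'.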

 Anyway I have looked into it some more and the immediate cause for LBU 
not to be used here is the:

		if (*s == '\n')
			putchar(port, '\r');

conditional.  If this part is removed, then LBU does get used for the:

		putchar(port, *s);

part and no ANDI is produced.

 The reason is that the compiler decides to reuse the load made to 
evaluate (*s == '\n'), which is done in the plain `char' data type (signed 
here, hence LB), for the following `putchar(port, *s)' call whenever the 
condition turns out to be false.  The value of `*s' therefore has to be 
zero-extended afterwards:

      b4:	00e08825 	move	s1,a3
      b8:	2413000a 	li	s3,10
      bc:	82050000 	lb	a1,0(s0)
      c0:	00000000 	nop
      c4:	14b30005 	bne	a1,s3,dc <uart_console_write+0x54>
      c8:	00000000 	nop
      cc:	2405000d 	li	a1,13
      d0:	0220f809 	jalr	s1
      d4:	02402025 	move	a0,s2
      d8:	82050000 	lb	a1,0(s0)
      dc:	26100001 	addiu	s0,s0,1
      e0:	02402025 	move	a0,s2
      e4:	0220f809 	jalr	s1
      e8:	30a500ff 	andi	a1,a1,0xff

(the load at bc is reused for the `putchar' call at e4 unless the character 
is `\n', in which case it is reloaded at d8).

 By using a temporary `unsigned char' variable and massaging the source 
code suitably GCC can be persuaded to use LBU instead, but the obfuscation 
of the source code and the resulting machine code produced seem not worth 
the effort IMO, so let's keep it simple.
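
 For the record, the kind of rewrite meant here could look roughly like the 
sketch below.  This is a user-space model only and not code from the patch: 
`struct uart_port' is a stand-in, `record_putchar', `console_write_sketch' 
and the `emit' parameter (playing the role of `putchar') are hypothetical 
names, and whether a given GCC version then actually picks LBU has not been 
verified for this exact form:

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's `struct uart_port'. */
struct uart_port {
	unsigned char buf[64];
	size_t len;
};

/* Hypothetical handler: records each character it is handed. */
static inline void record_putchar(struct uart_port *port, unsigned char ch)
{
	if (port->len < sizeof(port->buf))
		port->buf[port->len++] = ch;
}

/*
 * Sketch of the console write loop with a temporary `unsigned char'.
 * The byte is read once as unsigned, so both the comparison and the
 * call operate on a zero-extended value.
 */
static inline void console_write_sketch(struct uart_port *port,
					const char *s, unsigned int count,
					void (*emit)(struct uart_port *,
						     unsigned char))
{
	unsigned int i;

	for (i = 0; i < count; i++, s++) {
		unsigned char c = *(const unsigned char *)s;

		if (c == '\n')
			emit(port, '\r');
		emit(port, c);
	}
}
```

 The behaviour is the same as with the plain loop: a `\r' is emitted ahead 
of every `\n' and characters at or above 0x80 arrive in the [0..255] range.  
The extra cast and temporary are the obfuscation referred to above.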

 JFTR,

  Maciej


