RE: [PATCH 3/5] MIPS: LLVMLinux: Fix an 'inline asm input/output type mismatch' error.

Apologies for the slow response. I've had an excessive number of meetings over the last couple of days.

> -----Original Message-----
> From: Maciej W. Rozycki [mailto:macro@xxxxxxxxxxxxxx]
> Sent: 04 February 2015 12:58
> To: Daniel Sanders
> Cc: Toma Tabacu; Ralf Baechle; Markos Chandras; Leonid Yegoshin;
> linux-mips@xxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 3/5] MIPS: LLVMLinux: Fix an 'inline asm input/output
> type mismatch' error.
> 
> On Tue, 3 Feb 2015, Daniel Sanders wrote:
> 
> > From: Toma Tabacu <toma.tabacu@xxxxxxxxxx>
> >
> > Change the type of csum_ipv6_magic's 'proto' argument from unsigned
> > short to __u32.
> >
> > This fixes a type mismatch between the 'htonl(proto)' inline asm
> > input, which is __u32, and the 'proto' output, which is unsigned
> > short.
> >
> > This is the error message reported by clang:
> > arch/mips/include/asm/checksum.h:285:27: error: unsupported inline asm:
> > input with type '__be32' (aka 'unsigned int') matching output with type
> > 'unsigned short'
> >           "0" (htonl(len)), "1" (htonl(proto)), "r" (sum));
> >                                  ^~~~~~~~~~~~
> >
> > The changed code can be compiled successfully by both gcc and clang.
> 
>  This definitely looks like a bug in clang to me.  What this construct
> means is both input #5 and output #1 live in the same register, and that
> an `__u32' value is taken on input (from the result of the `htonl(proto)'
> calculation) and an `unsigned short' value produced in the same register
> on output, that'll be the value of the `proto' variable from there on.  A
> perfectly valid arrangement.  This would be the right arrangement to use
> with the MIPS16 SEH instruction for example.  Has this bug been reported
> to clang maintainers?

I'm not convinced it's a bug, but I do at least agree that the use case sounds
sensible. It makes sense to me that the focus should be on register allocation
rather than on types. However, the relevant clang source is very specific about
the cases it does and doesn't allow, which suggests the restriction is
deliberate. I've started a thread on the clang mailing list to find out more
about why we currently reject it.
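
For anyone following along, here is a minimal sketch of the shape under
discussion, reduced to an empty asm template so that it builds on any target.
The function name is hypothetical and is not part of the patch. The form from
the diagnostic above appears only in a comment. The compiled form keeps both
sides of the tie at 32 bits and narrows in C afterwards, which is one way to
avoid the mismatch without touching a function's prototype; it is a sketch
under those assumptions, not what the patch does.

```c
#include <stdint.h>

/*
 * Shape from the diagnostic in this thread (not compiled here):
 * a 32-bit input tied to a 16-bit output via a matching constraint,
 *
 *     : "=r" (sum), "=r" (proto)              proto is unsigned short
 *     : ..., "1" (htonl(proto)), ...          input is __be32
 *
 * Hypothetical alternative: tie a 32-bit temporary to itself and
 * narrow in C once the asm is done.
 */
static inline uint16_t tied_same_width(uint16_t proto)
{
	uint32_t reg = proto;	/* same width on both sides of the tie */

	/* Empty template: the value passes through the register unchanged. */
	__asm__("" : "=r" (reg) : "0" (reg));

	return (uint16_t)reg;	/* narrowing happens in C, not in the asm */
}
```

With the empty template, tied_same_width(58) simply returns 58; a real asm
body would do its work on the 32-bit register in between.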

>  And I'd prefer to leave the declaration of `proto' alone as IPv6 network
> protocol numbers are 16-bit quantities.
>
>  That said this code is indeed weird if not wrong, which is probably why
> this arrangement resulted, in an attempt to prevent GCC from messing up
> the registers used.
> 
>  First and foremost both outputs, and especially #1, lack an earlyclobber.
> This I imagine may have prompted GCC to overwrite one of the inputs, which
> in turn is why whoever poked at this code decided to alias input #5 to
> output #1.  But as you can see in the asm there's no real aliasing between
> input #5 and output #1.  Input #5 is consumed early on (and even referred
> to with `%5' rather than `%1', which would be the norm in the case of
> actual aliasing), and the containing register reused for something else.
> So the two operands can be separated.  This is unlike input #4 vs output
> #0, that is both read and written right away (and just as one'd expect
> there's no reference to `%4' anywhere).
> 
>  Output #0 can do without an earlyclobber as it is aliased to input #4 and
> therefore cannot be assigned by GCC to another input.  But it won't hurt
> to have one too and it will set a good practice and serve a documentation
> purpose.
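
To make the earlyclobber point concrete for the archive, here is a small
sketch. It is not taken from checksum.h; add3() is a hypothetical helper, and
the asm path is only used when building for MIPS, with a plain C fallback
elsewhere. The output is written by the first instruction while input %3 is
still waiting to be read, so without the '&' the compiler would be free to put
the output and %3 in the same register.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b + c (modulo 2^32). */
static inline uint32_t add3(uint32_t a, uint32_t b, uint32_t c)
{
	uint32_t out;

#if defined(__mips__)
	__asm__(
	"	addu	%0, %1, %2	\n"	/* %0 is written here ...     */
	"	addu	%0, %0, %3	\n"	/* ... before %3 is consumed  */
	: "=&r" (out)	/* '&': must not share a register with any input */
	: "r" (a), "r" (b), "r" (c));
#else
	out = a + b + c;	/* plain C fallback on other targets */
#endif
	return out;
}
```

Dropping the '&' may still happen to work, depending on how registers get
allocated, which is what makes this class of bug easy to miss.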
> 
>  I suggest a fix like this then:
> 
> static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
> 					  const struct in6_addr *daddr,
> 					  __u32 len, unsigned short proto,
> 					  __wsum sum)
> {
> 	__wsum tmp;
> 
> 	__asm__(
> [...]
> 	: "=&r" (sum), "=&r" (tmp)
> 	: "r" (saddr), "r" (daddr),
> 	  "0" (htonl(len)), "r" (htonl(proto)), "r" (sum));
> 
> 	return csum_fold(sum);
> }
> 
> Try and see if it works for you.
> 
>  I wonder why this is an asm in the first place though.  There's no rocket
> science here that GCC couldn't handle.  I guess it must have been very bad
> at optimising a C equivalent then.
> 
>   Maciej

Yes, that works for me on both GCC and Clang. I'll change the patch to this.
Would you like a 'Suggested-by' tag in the patch description?
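
On the side question of why this is an asm in the first place: for reference,
a rough userspace sketch of a C equivalent might look like the following. This
is an assumption-laden illustration rather than a proposed replacement. The
function names are hypothetical, the addresses are passed as raw 16-byte
arrays instead of struct in6_addr, and <arpa/inet.h> stands in for the kernel's
htonl(). I have not compared the code generated for it against the existing
asm.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* htonl(): userspace stand-in for the kernel helper */

/* Fold a 32-bit one's-complement sum down to 16 bits and invert it. */
static uint16_t csum_fold_sketch(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Hypothetical C equivalent; both addresses are in network byte order. */
static uint16_t csum_ipv6_magic_sketch(const uint8_t saddr[16],
				       const uint8_t daddr[16],
				       uint32_t len, uint16_t proto,
				       uint32_t sum)
{
	uint64_t acc = sum;
	uint32_t word;
	int i;

	/* Add both addresses as four 32-bit words each. */
	for (i = 0; i < 4; i++) {
		memcpy(&word, saddr + 4 * i, sizeof(word));
		acc += word;
		memcpy(&word, daddr + 4 * i, sizeof(word));
		acc += word;
	}
	acc += htonl(len);
	acc += htonl(proto);

	/* End-around carry: fold the 64-bit accumulator back to 32 bits. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);

	return csum_fold_sketch((uint32_t)acc);
}
```

The result is in network byte order, as with the existing helper, so a caller
comparing against a host-order constant would pass it through ntohs() first.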




