Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Maciej, first of all, thank you for your time on this,
appreciate it.

Comments inline

On 5 September 2015 at 22:25, Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> wrote:
> On Sat, 5 Sep 2015, Yousong Zhou wrote:
>
>> >  This can't be true.  The compiler does not intepret the contents of an
>> > inline asm and therefore cannot decide whether to inline a function
>> > containing one or not based on the lone presence or the absence of an
>> > assembly directive within.
>> >
>>
>> Most of the time I trust my compiler and never meddle with the
>> toolchain.  Anyway I made a patch because it really did not work for
>> me.   No big deal.  It's not the end of world.  It started with a
>> comment from OpenWrt packages feeds [1].  Actually this "unrecognized
>> opcode" problem have occurred within OpenWrt quite a few times before.
>>
>>  [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931
>
>  The bug certainly was there, it's just your analysis and consequently the
> fix that are wrong in the general case for some reason, maybe a buggy
> compiler.
>

This is the compiler "--version",

    mips-openwrt-linux-gcc (OpenWrt/Linaro GCC 4.8-2014.04 r46763) 4.8.3

>> >  It looks to me you're papering something over here -- when you use a
>> > `.set nomips16' directive the assembler will happily switch your
>> > instruction set in the middle of an instruction stream.  Consequently if
>> > this function is (for whatever reason; it should not really) inlined in
>> > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
>> > will obviously be interpreted differently in the MIPS16 execution mode and
>> > is therefore surely a recipe for disaster.
>>
>> If by "papering" you mean "made up", then whatever.  Yeah, it's
>> disaster, an "invalid instruction" abort.
>
>  By "papering over" I mean forcing source code to compile successfully at
> the risk of producing incorrect binary code.

Learned a new phrase/idiom :)

>
>> >  How did you test your change and made the conclusion quoted with your
>> > patch description?
>> >
>>
>> Compile the following program with a MIPS 24kc big endian variant compiler with
>> flag "-mips32r2 -mips16 -Os".
>>
>>     #include <stdio.h>
>>     #include <stdint.h>
>>
>>     uint16_t __attribute__((noinline)) f(uint16_t v)
>>     {
>>         v = __cpu_to_le16(v);
>>         return v;
>>     }
>>
>>     int main()
>>     {
>>         printf("%x\n", f(0xbeef));
>>
>>         return 0;
>>     }
>>
>> When only ".set nomips16" was specified in __arch_swab16(), this was output
>> from objdump.
>>
>>     242 004003e0 <f>:
>>     243   4003e0:       7c0410a0        wsbh    v0,a0
>>     244   4003e4:       e820ea31        swc2    $0,-5583(at)
>>     245   4003e8:       65006500        0x65006500
>>     246   4003ec:       65006500        0x65006500
>
>  Quite obviously.
>
>  For the record: the first instruction has been assembled in the regular
> MIPS mode and that propagated to symbol annotation.  Consequently `f' is
> seen by `objdump' as a regular MIPS function and disassembles it all as
> such.  You can put a global label immediately after the WSBH instruction
> in your source code to have the rest of the function disassembled
> correctly (of course this won't make this code work at the run time).

Thanks for the trick.  Objdump really disassembled it
correctly.

>
>> __arch_swab16() was indeed inlined.  That swc2 instruction can be got from
>> assembler with the following code (it's from the "-S" result of GCC).
>>
>>     .set    mips16
>>     .set    noreorder
>>     .set    nomacro
>>     j       $31
>>     zeh     $2
>>
>> When only nomips16 function attribute was specified, this time GCC failed with
>> unrecognized opcode
>>
>>     /tmp/ccaGCouL.s: Assembler messages:
>>     /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'
>>
>> The generated assembly was something in the following form.  Looks like the
>> assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"
>
>  There's no switch to regular MIPS mode implied with `.set arch=mip32r2',
> the directive merely switches the ISA level, affecting both the regular
> MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16
> instructions too, e.g. ZEH is a new addition).

Agreed

>
>>         .set mips16
>>
>>         .ent f
>>         .type f, @function
>>     f
>>         ...
>>         .set push
>>         .set arch=mips32r2
>>         wsbh $2,$4
>>         .pop
>>         j $31
>>         zeh $2
>>         .end f
>>         ...
>
>  That's exactly the papered-over buggy code scenario I've been referring
> to above.  This is clearly MIPS16 code: ZEH is a MIPS16 instruction only,
> there's no such regular MIPS counterpart.  And it also obviously fails to
> assemble because on the contrary there's no MIPS16 WSBH instruction.
>

Yes, it won't assemble.

>  Now if you stick `.set nomips16' just above WSBH, then this code will
> happily assemble, because this single instruction only (`.set pop' reverts
> any previous `.set' directives; I'm assuming you wrote above by hand and
> `.pop' is a typo) will assemble in the regular MIPS instruction mode.  But
> if this code is ever reached, then the processor will still execute the
> machine code produced by the assembler from the WSBH instruction in the
> MIPS16 mode.

Yes, I hand-copied it from the output of "gcc -S" just to
show the form/pattern (the original output is too long for
this conversation).  No, that `.pop' is not a typo (I just
did a double-check).

If I stick `.set nomips16` just above WSBH, that's just what
the original patch tries to do, papering over the fact that
it did not compile/assemble without it.

>
>  For example the encoding of:
>
>         wsbh    $2, $4
>
> is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode
> yields (in the big-endian mode):
>
> 00000000 <f>:
>    0:   7c04            sd      s0,32(a0)
>    2:   10a0            b       144 <f+0x144>
>
> -- so this won't do what you might expect, you'll get a Reserved
> Instruction exception on the SD instruction, which is not supported in the
> 32-bit mode, and consequently SIGILL.

Agreed.

>
>> The patch was run tested on QEMU Malta and an router with Atheros
>> AR9331 SoC.  I didn't test __arch_swab64() though.  I have done many
>> other trial-and-error tests while preparing this patch.  It was a mess
>> when I was sure I should expect some sensible behaviour from the
>> compiler while it actually just did not behave that way.
>
>  I've compiled your example provided and as I stated in the original mail
> `__arch_swab16' is always produced as a separate function, whether `.set
> nomips16' is present in the inline assembly placed there or not.  This is
> with (unreleased) GCC 6.0.0.
>
>  However if you happen to have a buggy compiler that fails to emit
> `__arch_swab16' as a separate function despite the `nomips16' attribute,
> then it's better if the resulting generated assembly code fails to
> assemble rather than if it goes astray at the run time.

Now, my compiler refused to emit `__arch_swab16' as a
separate function even with the `nomips16' function
attribute.  This is the kind of "mess" I just meant above, again.
I expect that it should emit a separate function and call it
with a jump observing that the caller `f' is in MIPS16 mode
yet the `__arch_swab16' is noMIPS16.  sigh~~

>
>> >  So setting aside the correctness issues discussed above, for MIPS16 code
>> > this has to be put out of line by the compiler, with all the usual
>> > overhead of a function call, causing a performance loss rather than a gain
>> > expected here.  Especially as switching the ISA mode requires draining the
>> > whole pipeline so that subsequent instructions are interpreted in the new
>> > execution mode.  This is expensive in performance terms.
>> >
>> >  I'm fairly sure simply disabling these asms (#ifndef __mips16) and
>> > letting the compiler generate the right mask and shift operations from
>> > plain C code will be cheaper in performance terms and possibly cheaper in
>> > code size as well, especially in otherwise leaf functions where an extra
>> > function call will according to the ABI clobber temporaries.  So I suggest
>> > going in that direction instead.
>>
>> I agree.  Then you will provide the fix right?  I am just curious
>> where that __mips16 should be placed or is it from compiler and
>> assembler?
>
>  No, it's your bug after all.  I think the last paragraph I wrote quoted
> above combined with the source code in question make it clear what to do.

Okay, I will try.  Most of the time when textbooks read
clearly/obviously/apparently, things go astray ;)

>
>  I have also checked what the difference in generated MIPS16 code is
> between a call to `__arch_swab16':
>
> 0000000c <f>:
>    c:   64c4            save    32,ra
>    e:   1800 0000       jal     0 <__arch_swab16>
>                         e: R_MIPS16_26  __arch_swab16
>   12:   ec31            zeh     a0
>   14:   6444            restore 32,ra
>   16:   e8a0            jrc     ra
>
> and equivalent generic code:
>
> 0000000c <f>:
>    c:   ec31            zeh     a0
>    e:   3280            sll     v0,a0,8
>   10:   3482            srl     a0,8
>   12:   ea8d            or      v0,a0
>   14:   e820            jr      ra
>   16:   ea31            zeh     v0
>
> so the win is I think clear.
>
>  Finally the MIPS64 `__arch_swab64' case does not matter as we have no
> MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse
> to build it (only bare metal is supported).

Thanks for the information.  Never played with MIPS64
before.

Regards

                yousong




[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux