On Tue, Jul 04, 2017 at 04:32:56PM +0100, Maciej W. Rozycki wrote: > On Tue, 4 Jul 2017, Amit Pundir wrote: > > > From: Yousong Zhou <yszhou4tech@xxxxxxxxx> > > > > commit 71a0a72456b48de972d7ed613b06a22a3aa9057f upstream. > > > > Some GCC versions (e.g. 4.8.3) can incorrectly inline a function with > > MIPS32 instructions into another function with MIPS16 code [1], causing > > the assembler to genereate incorrect binary code or fail right away > > complaining about unrecognized opcode. > > > > In the case of __arch_swab{16,32}, when inlined by the compiler with > > flags `-mips32r2 -mips16 -Os', the assembler can fail with the following > > error. > > > > {standard input}:79: Error: unrecognized opcode `wsbh $2,$2' > > > > For performance concerns and to workaround the issue already existing in > > older compilers, just ignore these 2 functions when compiling with > > mips16 enabled. > > > > [1] Inlining nomips16 function into mips16 function can result in > > undefined builtins, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777 > > The patch is correct, however the description does not match reality. > There is no GCC bug involved here as: "Some GCC versions (e.g. 4.8.3) can > incorrectly inline a function [...]" would suggest, and GCC PR > target/55777 has nothing to do with it. > > Here you have inline functions including an inline asm each with assembly > instructions that do not have a MIPS16 representation. Unlike with GCC PR > target/55777 these functions are *not* marked with > `__attribute__((nomips16))', so the compiler is free to inline them into > any code, including MIPS16 code in particular, not being aware that the > inline asm is incompatible with MIPS16 assembly. It can't be aware as the > compiler is not an assembler and it does not interpret the string > representing assembly code to be produced from an inline asm (beyond > counting assembly instruction separators). > > Marking these functions with `__attribute__((nomips16))' would instruct > the compiler to compile these functions as well as any function they get > inlined into as regular MIPS code, which may or may not be desirable (plus > *then* you might hit GCC PR target/55777, and IIRC some other bugs in > older compilers). So excluding them from MIPS16 code instead seems like a > reasonable choice. > > OTOH, generic MIPS16 byte-swap code produced is awful, especially > `swab32' (and `swab64' does not apply as we don't support 64-bit MIPS16 > code under Linux), so perhaps for MIPS16 code these really ought to be > `__attribute__((nomips16, noinline))' instead, avoiding turning the caller > into regular MIPS code as well as any old `nomips16' function inlining > bugs; although the benefit might outweigh the cost if one of these > functions is called from an otherwise leaf function and spilling registers > to the stack turns out necessary where it otherwise would not be. > > Finally, for regular MIPS compilations contemporary versions of GCC > already produce the same assembly as our <uapi/asm/swab.h> provides, e.g. > from: > > $ cat swap.c > unsigned short int swap16(unsigned short int i) > { > return i << 8 | i >> 8; > } > > unsigned int swap32(unsigned int i) > { > return i << 24 | (i & 0xff00) << 8 | (i & 0xff0000) >> 8 | i >> 24; > } > > unsigned long long int swap64(unsigned long long int i) > { > return (i << 56 | (i & 0xff00LL) << 40 | > (i & 0xff0000LL) << 24 | (i & 0xff000000LL) << 8 | > (i & 0xff00000000LL) >> 8 | (i & 0xff0000000000LL) >> 24 | > (i & 0xff000000000000LL) >> 40 | i >> 56); > } > $ > > you get: > > $ gcc -O2 -mabi=64 -march=mips64r2 -c swap.c > $ objdump -d swap.o > > swap.o: file format elf64-tradbigmips > > > Disassembly of section .text: > > 0000000000000000 <swap16>: > 0: 7c0410a0 wsbh v0,a0 > 4: 03e00008 jr ra > 8: 3042ffff andi v0,v0,0xffff > c: 00000000 nop > > 0000000000000010 <swap32>: > 10: 7c0410a0 wsbh v0,a0 > 14: 03e00008 jr ra > 18: 00221402 ror v0,v0,0x10 > 1c: 00000000 nop > > 0000000000000020 <swap64>: > 20: 7c0410a4 dsbh v0,a0 > 24: 03e00008 jr ra > 28: 7c021164 dshd v0,v0 > 2c: 00000000 nop > $ > > so I think we ought to make all this conditional on GCC being old enough, > as the compiler is always better suited to make code scheduling decisions > when there's no inline asm involved. Ok, but I'm not changing the changelog comments from the original patch :) If you think a follow-on patch is needed in the tree, please submit it and mark it for stable inclusion. thanks, greg k-h