GCC is smart enough to substitute the final result for FLS calculations as implemented in the fallback C code we have in `__fls' and `fls' applied to constant values. The presence of inline asm defeats the compiler though, forcing it to emit extraneous CLZ/DCLZ calculation for processors that support these instructions. Use `__builtin_constant_p' then to avoid inline asm altogether for constants. Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> --- Ralf, Even my good old trusty 4.1.2 can do it... Maciej linux-mips-const-fls.diff Index: linux/arch/mips/include/asm/bitops.h =================================================================== --- linux.orig/arch/mips/include/asm/bitops.h 2015-04-02 20:18:51.384515000 +0100 +++ linux/arch/mips/include/asm/bitops.h 2015-04-02 20:27:54.273191000 +0100 @@ -481,7 +481,7 @@ static inline unsigned long __fls(unsign { int num; - if (BITS_PER_LONG == 32 && + if (BITS_PER_LONG == 32 && !__builtin_constant_p(word) && __builtin_constant_p(cpu_has_clo_clz) && cpu_has_clo_clz) { __asm__( " .set push \n" @@ -494,7 +494,7 @@ static inline unsigned long __fls(unsign return 31 - num; } - if (BITS_PER_LONG == 64 && + if (BITS_PER_LONG == 64 && !__builtin_constant_p(word) && __builtin_constant_p(cpu_has_mips64) && cpu_has_mips64) { __asm__( " .set push \n" @@ -559,7 +559,8 @@ static inline int fls(int x) { int r; - if (__builtin_constant_p(cpu_has_clo_clz) && cpu_has_clo_clz) { + if (!__builtin_constant_p(x) && + __builtin_constant_p(cpu_has_clo_clz) && cpu_has_clo_clz) { __asm__( " .set push \n" " .set "MIPS_ISA_LEVEL" \n"