On 29/11/16 03:18 AM, Jochen Rollwagen wrote:
> This commit replaces the loop for calculating log base 2 for
> non-x86-platforms in radeon.h with a clz (count leading zeroes)-based
> version to simplify the code and, well, eliminate the loop.
> Note: There's no check for val=0 case, since x86-bsr is undefined for
> that case too, that should be okay.
> ---
>  src/radeon.h | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/radeon.h b/src/radeon.h
> index cbc7866..b1a1ce0 100644
> --- a/src/radeon.h
> +++ b/src/radeon.h
> @@ -933,17 +933,16 @@ enum {
>  static __inline__ int
>  RADEONLog2(int val)
>  {
> -    int bits;
>  #if (defined __i386__ || defined __x86_64__) && (defined __GNUC__)
> +    int bits;
> +
>      __asm volatile("bsrl %1, %0"
>          : "=r" (bits)
>          : "c" (val)
>          );
>      return bits;
>  #else
> -    for (bits = 0; val != 0; val >>= 1, ++bits)
> -        ;
> -    return bits - 1;
> +    return (31 - __builtin_clz(val));
>  #endif
>  }

Any reason for not using __builtin_clz on x86 as well? AFAICT both gcc
and clang seem to generate more or less the same code with that as with
the inline assembly.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
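
For reference, a minimal sketch of what the suggestion above could look like: a single clz-based function with no per-architecture #if, set next to the loop it replaces. The names log2_loop, log2_clz and log2_versions_agree are hypothetical and not part of radeon.h. The sketch assumes GCC or Clang (for __builtin_clz) and a 32-bit int, and like the patch it leaves val == 0 undefined.

```c
#include <limits.h>

/* Loop-based version removed by the patch (only meaningful for val > 0). */
static inline int log2_loop(int val)
{
    int bits;

    for (bits = 0; val != 0; val >>= 1, ++bits)
        ;
    return bits - 1;
}

/*
 * clz-based version, usable on every architecture.
 * Assumes a 32-bit int; the result is undefined for val == 0,
 * just as x86 bsr leaves its destination undefined for a zero source.
 */
static inline int log2_clz(int val)
{
    return 31 - __builtin_clz((unsigned int)val);
}

/*
 * Hypothetical self-check: returns 1 if both versions agree on every
 * power of two, on the value just below each power of two, and on INT_MAX.
 */
static int log2_versions_agree(void)
{
    for (int shift = 0; shift < 31; ++shift) {
        int p = 1 << shift;

        if (log2_loop(p) != shift || log2_clz(p) != shift)
            return 0;
        if (p > 1 && log2_loop(p - 1) != log2_clz(p - 1))
            return 0;
    }
    return log2_loop(INT_MAX) == log2_clz(INT_MAX);
}
```

Whether the compiler actually emits bsr (or lzcnt) for the builtin depends on the target flags, so the generated assembly is worth checking before dropping the inline asm.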