Patrick Steinhardt <ps@xxxxxx> writes:

> In any case, GCC is clever enough to notice what we're doing:
>
>     fastlog2(unsigned long):
>             xor     eax, eax
>             test    rdi, rdi
>             je      .L5
>             bsr     rax, rdi
>     .L5:
>             ret

Nice.  Aiming to compile to "bsr" is very good.

> So with the following definition we're optimizing both with GCC and
> Clang:
>
>     size_t fastlog2(size_t sz)
>     {
>         size_t l = 0;
>         if (!sz)
>             return 0;
>         for (; sz; sz >>= 1)
>             l++;
>         return l;
>     }
>
> I'd thus say we can just pick that function instead of caring about
> platform endianess with `ffs()`.

The above loop that compilers seem to know to reduce to "bsr" is
good.

FWIW, because the definition of "first bit" in ffs() is "least
significant bit", you do not need to worry about endianness at all,
but what we want is most significant, so it is not directly usable.
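
To make the difference concrete, here is a minimal sketch that runs
both on the same values.  It assumes the POSIX ffs() from <strings.h>
and reuses the loop quoted above; the helper name
compare_ffs_and_fastlog2() is made up purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* ffs(), POSIX */

/*
 * The loop from the quoted message: returns the 1-based position of
 * the most significant set bit, or 0 when no bit is set.
 */
static size_t fastlog2(size_t sz)
{
	size_t l = 0;
	if (!sz)
		return 0;
	for (; sz; sz >>= 1)
		l++;
	return l;
}

/* Hypothetical helper, for illustration only. */
static void compare_ffs_and_fastlog2(void)
{
	/*
	 * 40 is 0b101000: ffs() reports the lowest set bit (position 4,
	 * counting from 1), the loop reports the highest (position 6).
	 */
	assert(ffs(40) == 4);
	assert(fastlog2(40) == 6);

	/* With a single bit set, the two happen to agree. */
	assert(ffs(8) == 4);
	assert(fastlog2(8) == 4);

	/* Both return 0 when no bit is set. */
	assert(ffs(0) == 0);
	assert(fastlog2(0) == 0);
}
```

Both results are defined on the value, not on its in-memory byte
layout, so endianness does not enter the picture either way; the two
simply look from opposite ends of the word.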