On Tue, Sep 03, 2024 at 11:23:12AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > > > I also noticed `log2i()`, but honestly the only reason why I didn't > > reuse it is that I had no clue where to put it. There isn't any header > > that would be a good fit for it, and creating a new "math.h" header for > > a single function felt overblown to me. So I decided to just not bother. > > I'm happy to adjust though if somebody has a suggestion for where to put > > it. > > Given the existing contents of wrapper.h (near the end of the file), > I think wrapper.c would be a good place to do so. Indeed. Quite a grab bag of functions, thanks for the hint! > Shouldn't this essentially be a call to ffs() with the argument > tweaked, by the way? We are looking for the last set bit, not for the first set bit. We can massage things a bit, but wouldn't that require us to be aware of the platforms endianess? In any case, GCC is clever enough to notice what we're doing: fastlog2(unsigned long): xor eax, eax test rdi, rdi je .L5 bsr rax, rdi .L5: ret The `log2i()` function looks a bit less efficient: log2i(unsigned long): test rdi, rdi je .L3 bsr rdi, rdi lea eax, [rdi+1] cdqe ret .L3: xor eax, eax ret Clang isn't yet clever enough with v18.1, but is with trunk: log2i(unsigned long): bsr rax, rdi inc rax test rdi, rdi cmove rax, rdi ret fastlog2(unsigned long): test rdi, rdi je .LBB1_1 shr rdi bsr rcx, rdi mov eax, 127 cmovne rax, rcx xor eax, -64 add eax, 65 ret .LBB1_1: mov eax, -1 ret So with the following definition we're optimizing both with GCC and Clang: size_t fastlog2(size_t sz) { size_t l = 0; if (!sz) return 0; for (; sz; sz >>= 1) l++; return l; } I'd thus say we can just pick that function instead of caring about platform endianess with `ffs()`. Patrick