On Tue, Jul 26, 2022 at 11:18 AM Yury Norov <yury.norov@xxxxxxxxx> wrote:
> We have find_bit_benchmark to check how it works in practice. Would be
> great if someone with access to the hardware can share numbers.
Honestly, I doubt benchmarking find_bit in a loop is all that sensible.

These are helper functions that are probably seldom super-hot in cache, and as with so many of these things, I suspect the cool-I$ numbers are the ones that matter most in real life.

When some filesystem ends up searching for a free block or similar, it will probably have done other things before that, which means that the L1 I$ has long since been flushed, and branch history is quite possibly entirely gone too. The same is quite possibly true for the bitmap itself in D$.

That said, looking at the x86 code generation (not only because I have the build environment, but because that's where I can actually read and make sense of the asm), the only thing that looks bad is the conditional bswap.

So the code gen looks fairly good.

Linus