This series adds a compile test to make sure that all the bitops bit-find operations (namely __ffs(), ffs(), ffz(), __fls(), fls(), fls64()) correctly fold constant expressions, given that their argument is also a constant expression. The other functions from bitops.h are out of scope.

So far, only the m68k and the hexagon architectures lack such optimization. To this end, the first two patches optimize the m68k architecture, and the third and fourth optimize the hexagon architecture bitops functions. The fifth and final patch adds the compile-time tests to assert that the constant folding occurs and that the result is accurate.

This is tested on arm, arm64, hexagon, m68k, x86 and x86_64. For the other architectures, I am putting my trust in the kernel test robot to send a report if any of them still lacks bitops optimizations. The kernel test robot did not complain on v3, giving me confidence that all architectures are now properly optimized.

---

** Changelog **

v3 -> v4:

  - Only apply __always_inline to the bit-find functions; do not touch
    the other functions from bitops.h. I discovered that the benchmark
    done in v3 was incorrect (refer to the thread for details). The
    scope was thus narrowed down to the bit-find functions, for which
    I could demonstrate the gain in the benchmark.

  - Add a benchmark for hexagon (patches 3/5 and 4/5). Unlike the m68k
    benchmark, which uses an allyesconfig, the hexagon benchmark uses a
    defconfig. The reason is just that the allyesconfig did not work on
    first try in my environment (even before applying this series), and
    I did not spend effort to troubleshoot it.

  - Add Geert's review tag to patch 2/5. Although I also received the
    tag for patch 1/5, I did not apply it there due to new changes in
    that patch.

  - Do not split the lines containing tags.

Link: https://lore.kernel.org/all/20231217071250.892867-1-mailhol.vincent@xxxxxxxxxx/

v2 -> v3:

  - Add patches 1/5 and 2/5 to optimize the m68k architecture bitops.
    Thanks to the kernel test robot for reporting!
  - Add patches 3/5 and 4/5 to optimize the hexagon architecture
    bitops. Thanks to the kernel test robot for reporting!

  - Patch 5/5: mark test_bitops_const_eval() as __always_inline and,
    this done, pass n (the test number) as a parameter. Previously,
    only BITS(10) was tested. Add tests for BITS(0) and BITS(31).

Link: https://lore.kernel.org/all/20231130102717.1297492-1-mailhol.vincent@xxxxxxxxxx/

v1 -> v2:

  - Drop the RFC patch. v1 was not ready to be applied on x86 because
    of pending changes in arch/x86/include/asm/bitops.h. This was
    finally fixed by Nick in commit 3dae5c43badf ("x86/asm/bitops: Use
    __builtin_clz{l|ll} to evaluate constant expressions"). Thanks Nick!

  - Update the commit description.

  - Introduce the test_const_eval() macro to factorize code.

  - No functional change.

Link: https://lore.kernel.org/all/20221111081316.30373-1-mailhol.vincent@xxxxxxxxxx/

Vincent Mailhol (5):
  m68k/bitops: force inlining of all bit-find functions
  m68k/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant
    expressions
  hexagon/bitops: force inlining of all bit-find functions
  hexagon/bitops: use __builtin_{clz,ctzl,ffs} to evaluate constant
    expressions
  lib: test_bitops: add compile-time optimization/evaluations assertions

 arch/hexagon/include/asm/bitops.h | 25 +++++++++++++++++++-----
 arch/m68k/include/asm/bitops.h    | 26 ++++++++++++++++++-------
 lib/Kconfig.debug                 |  4 ++++
 lib/test_bitops.c                 | 32 +++++++++++++++++++++++++++++++
 4 files changed, 75 insertions(+), 12 deletions(-)

-- 
2.43.0