The quilt patch titled Subject: bitops: optimize fns() for improved performance has been removed from the -mm tree. Its filename was bitops-optimize-fns-for-improved-performance.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Kuan-Wei Chiu <visitorckw@xxxxxxxxx> Subject: bitops: optimize fns() for improved performance Date: Thu, 2 May 2024 17:24:43 +0800 The current fns() repeatedly uses __ffs() to find the index of the least significant bit and then clears the corresponding bit using __clear_bit(). The method for clearing the least significant bit can be optimized by using word &= word - 1 instead. Typically, the execution time of one __ffs() plus one __clear_bit() is longer than that of a bitwise AND operation and a subtraction. To improve performance, the loop for clearing the least significant bit has been replaced with word &= word - 1, followed by a single __ffs() operation to obtain the answer. This change reduces the number of __ffs() iterations from n to just one, enhancing overall performance. This modification significantly accelerates the fns() function in the test_bitops benchmark, improving its speed by approximately 7.6 times. Additionally, it enhances the performance of find_nth_bit() in the find_bit benchmark by approximately 26%. Before: test_bitops: fns: 58033164 ns find_nth_bit: 4254313 ns, 16525 iterations After: test_bitops: fns: 7637268 ns find_nth_bit: 3362863 ns, 16501 iterations Link: https://lkml.kernel.org/r/20240502092443.6845-3-visitorckw@xxxxxxxxx Signed-off-by: Kuan-Wei Chiu <visitorckw@xxxxxxxxx> Cc: Ching-Chun (Jim) Huang <jserv@xxxxxxxxxxxxxxxx> Cc: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> Cc: Yury Norov <yury.norov@xxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- include/linux/bitops.h | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) --- a/include/linux/bitops.h~bitops-optimize-fns-for-improved-performance +++ a/include/linux/bitops.h @@ -254,16 +254,10 @@ static inline unsigned long __ffs64(u64 */ static inline unsigned long fns(unsigned long word, unsigned int n) { - unsigned int bit; + while (word && n--) + word &= word - 1; - while (word) { - bit = __ffs(word); - if (n-- == 0) - return bit; - __clear_bit(bit, &word); - } - - return BITS_PER_LONG; + return word ? __ffs(word) : BITS_PER_LONG; } /** _ Patches currently in -mm which might be from visitorckw@xxxxxxxxx are