Re: [PATCH v2 01/18] lib/parity: Add __builtin_parity() fallback implementations

Hi Yury,

On Sat, Mar 01, 2025 at 10:10:20PM -0500, Yury Norov wrote:
> On Sat, Mar 01, 2025 at 10:23:52PM +0800, Kuan-Wei Chiu wrote:
> > Add generic C implementations of __paritysi2(), __paritydi2(), and
> > __parityti2() as fallback functions in lib/parity.c. These functions
> > compute the parity of a given integer using a bitwise approach and are
> > marked with __weak, allowing architecture-specific implementations to
> > override them.
> > 
> > This patch serves as preparation for using __builtin_parity() by
> > ensuring a fallback mechanism is available when the compiler does not
> > inline the __builtin_parity().
> > 
> > Co-developed-by: Yu-Chun Lin <eleanor15x@xxxxxxxxx>
> > Signed-off-by: Yu-Chun Lin <eleanor15x@xxxxxxxxx>
> > Signed-off-by: Kuan-Wei Chiu <visitorckw@xxxxxxxxx>
> > ---
> >  lib/Makefile |  2 +-
> >  lib/parity.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 49 insertions(+), 1 deletion(-)
> >  create mode 100644 lib/parity.c
> > 
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 7bab71e59019..45affad85ee4 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -51,7 +51,7 @@ obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
> >  	 bsearch.o find_bit.o llist.o lwq.o memweight.o kfifo.o \
> >  	 percpu-refcount.o rhashtable.o base64.o \
> >  	 once.o refcount.o rcuref.o usercopy.o errseq.o bucket_locks.o \
> > -	 generic-radix-tree.o bitmap-str.o
> > +	 generic-radix-tree.o bitmap-str.o parity.o
> >  obj-y += string_helpers.o
> >  obj-y += hexdump.o
> >  obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
> > diff --git a/lib/parity.c b/lib/parity.c
> > new file mode 100644
> > index 000000000000..a83ff8d96778
> > --- /dev/null
> > +++ b/lib/parity.c
> > @@ -0,0 +1,48 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * lib/parity.c
> > + *
> > + * Copyright (C) 2025 Kuan-Wei Chiu <visitorckw@xxxxxxxxx>
> > + * Copyright (C) 2025 Yu-Chun Lin <eleanor15x@xxxxxxxxx>
> > + *
> > + * __parity[sdt]i2 can be overridden by linking arch-specific versions.
> > + */
> > +
> > +#include <linux/export.h>
> > +#include <linux/kernel.h>
> > +
> > +/*
> > + * One explanation of this algorithm:
> > + * https://funloop.org/codex/problem/parity/README.html
> 
> I already asked you not to spread this link. Is there any reason to
> ignore it?
> 
In v2, this algorithm was removed from bitops.h, so I moved the link
here instead. I'm sorry if it seemed like I ignored your comment.
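
For anyone following the thread without opening the link: the constant
0x6996 in the quoted functions works as a 16-entry lookup table in which
bit i holds the parity of the 4-bit value i. The userspace sketch below
(not kernel code; nibble_parity_table() is a hypothetical helper name
used only for illustration) rebuilds the constant from that definition:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: build a 16-bit table
 * whose bit i is the parity (XOR of all bits) of the 4-bit value i.
 */
static uint16_t nibble_parity_table(void)
{
	uint16_t table = 0;

	for (unsigned int i = 0; i < 16; i++) {
		unsigned int bits = i;
		unsigned int p = 0;

		/* XOR the bits of i together, one at a time. */
		while (bits) {
			p ^= bits & 1;
			bits >>= 1;
		}
		table |= (uint16_t)(p << i);
	}
	return table;
}
```

Once the input has been folded down to 4 bits with xor and shift,
(0x6996 >> (val & 0xf)) & 1 simply reads the matching table entry.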

> > + */
> > +int __weak __paritysi2(u32 val);
> > +int __weak __paritysi2(u32 val)
> > +{
> > +	val ^= val >> 16;
> > +	val ^= val >> 8;
> > +	val ^= val >> 4;
> > +	return (0x6996 >> (val & 0xf)) & 1;
> > +}
> > +EXPORT_SYMBOL(__paritysi2);
> > +
> > +int __weak __paritydi2(u64 val);
> > +int __weak __paritydi2(u64 val)
> > +{
> > +	val ^= val >> 32;
> > +	val ^= val >> 16;
> > +	val ^= val >> 8;
> > +	val ^= val >> 4;
> > +	return (0x6996 >> (val & 0xf)) & 1;
> > +}
> > +EXPORT_SYMBOL(__paritydi2);
> > +
> > +int __weak __parityti2(u64 val);
> > +int __weak __parityti2(u64 val)
> > +{
> > +	val ^= val >> 32;
> > +	val ^= val >> 16;
> > +	val ^= val >> 8;
> > +	val ^= val >> 4;
> > +	return (0x6996 >> (val & 0xf)) & 1;
> > +}
> > +EXPORT_SYMBOL(__parityti2);
> 
> OK, it seems I wasn't clear enough on the previous round, so I'll try
> to be very straightforward now.
> 
> To begin with, the difference between __parityti2 and __paritydi2
> doesn't exist. I'm serious. I put them side by side, and there's
> no difference at all.
> 
> Next, this is all clearly overengineering. You bake in all those weak,
> const and (ironically, missing) highly efficient arch implementations.
> But you show no evidence that:
>  - it improves code generation, or
>  - the drivers care about parity()'s performance,
> and you show no perf tests at all.
> 
> So you end up with +185/-155 LOCs.
> 
> Those +30 lines add no new functionality. You copy-paste the same
> algorithm again and again in very core kernel files. This is a no-go
> for a nice consolidation series.
> 
> In the previous round reviewers gave you quite a few nice suggestions.
> H. Peter Anvin suggested switching the function to return a boolean, I
> suggested making it a macro and even sent you a patch, Jiri and David
> also spent their time trying to help you, and were ignored.
> 
> Nevertheless. NAK for the series. Whatever you end up with, if it comes
> to v3, please keep it simple, avoid code duplication and run checkpatch.
> 
In v1, I used the same approach as parity8() because I couldn't justify
the performance impact in a specific driver or subsystem. However,
multiple people commented on using __builtin_parity or an x86 assembly
implementation. I'm not ignoring their feedback; I want to address these
comments. Before submitting, I sent an email explaining my current
approach: using David's suggested method along with __builtin_parity,
but no one responded. So, I decided to submit v2 for discussion
instead.

To avoid mistakes in v3, I want to confirm the following changes before
sending it:

(a) Change the return type from int to bool.
(b) Avoid __builtin_parity and use the same approach as parity8().
(c) Implement parity16/32/64() as single-line inline functions that
    call the next smaller variant after xor.
(d) Add a parity() macro that selects the appropriate parityXX() based
    on type size.
(e) Update users to use this parity() macro.

However, (d) may require a single patch that touches multiple subsystems
at once, because some files that already include bitops.h define their
own functions named parity(), which would conflict with the macro. The
alternative is to not add the macro at all.
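
To make (a) through (d) concrete, here is a minimal userspace sketch of
the idea, not the kernel patch itself. It assumes GCC or Clang
(statement expressions and the __builtin_parity*() builtins), uses
<stdint.h> types in place of u8/u16/u32/u64, and dispatches on sizeof
instead of BITS_PER_TYPE() and BUILD_BUG(). The macro is called
parity_of() here purely for illustration, and parity_selfcheck() is a
hypothetical helper that compares each variant against the builtin:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fold 8 bits down to 4, then look the nibble up in the 0x6996 table. */
static inline bool parity8(uint8_t val)
{
	val ^= val >> 4;
	return (0x6996 >> (val & 0xf)) & 1;
}

/* Each wider variant xors its two halves and calls the next smaller one. */
static inline bool parity16(uint16_t val)
{
	return parity8(val ^ (val >> 8));
}

static inline bool parity32(uint32_t val)
{
	return parity16(val ^ (val >> 16));
}

static inline bool parity64(uint64_t val)
{
	return parity32(val ^ (val >> 32));
}

/* Illustrative stand-in for the proposed parity() macro. */
#define parity_of(val)						\
({								\
	bool __ret;						\
	switch (sizeof(val)) {					\
	case 8:							\
		__ret = parity64((uint64_t)(val));		\
		break;						\
	case 4:							\
		__ret = parity32((uint32_t)(val));		\
		break;						\
	case 2:							\
		__ret = parity16((uint16_t)(val));		\
		break;						\
	default:						\
		__ret = parity8((uint8_t)(val));		\
		break;						\
	}							\
	__ret;							\
})

/* Hypothetical self-check: 0 if all widths agree with the builtins. */
static int parity_selfcheck(void)
{
	uint64_t x = 0x0123456789abcdefULL;

	for (int i = 0; i < 1000; i++) {
		/* xorshift step, only to vary the bit patterns */
		x ^= x << 13;
		x ^= x >> 7;
		x ^= x << 17;

		if (parity_of(x) != (bool)__builtin_parityll(x))
			return -1;
		if (parity_of((uint32_t)x) != (bool)__builtin_parity((uint32_t)x))
			return -1;
		if (parity_of((uint16_t)x) != (bool)__builtin_parity((uint16_t)x))
			return -1;
		if (parity_of((uint8_t)x) != (bool)__builtin_parity((uint8_t)x))
			return -1;
	}
	return 0;
}
```

The naming conflict follows from parity() being a function-like macro:
in any file that includes bitops.h and also defines its own
parity(...) function, the preprocessor would expand that definition as
a macro invocation and the build would fail.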

As for checkpatch.pl warnings, they are mostly pre-existing coding
style issues in this series. I've kept them as-is, but if preferred,
I'm fine with fixing them.

If anything is incorrect or if there are concerns, please let me know.

Regards,
Kuan-Wei

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index c1cb53cf2f0f..47b7eca8d3b7 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -260,6 +260,43 @@ static inline int parity8(u8 val)
 	return (0x6996 >> (val & 0xf)) & 1;
 }

+static inline bool parity16(u16 val)
+{
+	return parity8(val ^ (val >> 8));
+}
+
+static inline bool parity32(u32 val)
+{
+	return parity16(val ^ (val >> 16));
+}
+
+static inline bool parity64(u64 val)
+{
+	return parity32(val ^ (val >> 32));
+}
+
+#define parity(val)			\
+({					\
+	bool __ret;			\
+	switch (BITS_PER_TYPE(val)) {	\
+	case 64:			\
+		__ret = parity64(val);	\
+		break;			\
+	case 32:			\
+		__ret = parity32(val);	\
+		break;			\
+	case 16:			\
+		__ret = parity16(val);	\
+		break;			\
+	case 8:				\
+		__ret = parity8(val);	\
+		break;			\
+	default:			\
+		BUILD_BUG();		\
+	}				\
+	__ret;				\
+})
+
 /**
  * __ffs64 - find first set bit in a 64 bit word
  * @word: The 64 bit word




