On February 25, 2025 2:46:23 PM PST, David Laight <david.laight.linux@xxxxxxxxx> wrote: >On Mon, 24 Feb 2025 13:55:28 -0800 >"H. Peter Anvin" <hpa@xxxxxxxxx> wrote: > >> On 2/24/25 07:24, Uros Bizjak wrote: >> > >> > >> > On 23. 02. 25 17:42, Kuan-Wei Chiu wrote: >> >> Refactor parity calculations to use the standard parity8() helper. This >> >> change eliminates redundant implementations and improves code >> >> efficiency. >... >> Of course, on x86, parity8() and parity16() can be implemented very simply: >> >> (Also, the parity functions really ought to return bool, and be flagged >> __attribute_const__.) >> >> static inline __attribute_const__ bool _arch_parity8(u8 val) >> { >> bool parity; >> asm("and %0,%0" : "=@ccnp" (parity) : "q" (val)); >> return parity; >> } >> >> static inline __attribute_const__ bool _arch_parity16(u16 val) >> { >> bool parity; >> asm("xor %h0,%b0" : "=@ccnp" (parity), "+Q" (val)); >> return parity; >> } > >The same (with fixes) can be done for parity64() on 32bit. > >> >> In the generic algorithm, you probably should implement parity16() in >> terms of parity8(), parity32() in terms of parity16() and so on: >> >> static inline __attribute_const__ bool parity16(u16 val) >> { >> #ifdef ARCH_HAS_PARITY16 >> if (!__builtin_const_p(val)) >> return _arch_parity16(val); >> #endif >> return parity8(val ^ (val >> 8)); >> } >> >> This picks up the architectural versions when available. > >Not the best way to do that. >Make the name in the #ifdef the same as the function and define >a default one if the architecture doesn't define one. >So: > >static inline parity16(u16 val) >{ > return __builtin_const_p(val) ? _parity_const(val) : _parity16(val); >} > >#ifndef _parity16 >static inline _parity16(u15 val) >{ > return _parity8(val ^ (val >> 8)); >} >#endif > >You only need one _parity_const(). > >> >> Furthermore, if a popcnt instruction is known to exist, then the parity >> is simply popcnt(x) & 1. > >Beware that some popcnt instructions are slow. > > David > >> >> -hpa >> >> > Seems more verbose than just #ifdef _arch_parity8 et al since the const and generic code cases are the same (which they aren't always.) But that part is a good idea, especially since on at least *some* architectures like x86 doing: #define _arch_parity8(x) __builtin_parity(x) ... etc is entirely reasonable and lets gcc use an already available parity flag should one be available. The inline wrapper, of course, takes care of the type mangling.