On Mon, 24 Feb 2025 13:55:28 -0800 "H. Peter Anvin" <hpa@xxxxxxxxx> wrote: > On 2/24/25 07:24, Uros Bizjak wrote: > > > > > > On 23. 02. 25 17:42, Kuan-Wei Chiu wrote: > >> Refactor parity calculations to use the standard parity8() helper. This > >> change eliminates redundant implementations and improves code > >> efficiency. ... > Of course, on x86, parity8() and parity16() can be implemented very simply: > > (Also, the parity functions really ought to return bool, and be flagged > __attribute_const__.) > > static inline __attribute_const__ bool _arch_parity8(u8 val) > { > bool parity; > asm("and %0,%0" : "=@ccnp" (parity) : "q" (val)); > return parity; > } > > static inline __attribute_const__ bool _arch_parity16(u16 val) > { > bool parity; > asm("xor %h0,%b0" : "=@ccnp" (parity), "+Q" (val)); > return parity; > } The same (with fixes) can be done for parity64() on 32bit. > > In the generic algorithm, you probably should implement parity16() in > terms of parity8(), parity32() in terms of parity16() and so on: > > static inline __attribute_const__ bool parity16(u16 val) > { > #ifdef ARCH_HAS_PARITY16 > if (!__builtin_const_p(val)) > return _arch_parity16(val); > #endif > return parity8(val ^ (val >> 8)); > } > > This picks up the architectural versions when available. Not the best way to do that. Make the name in the #ifdef the same as the function and define a default one if the architecture doesn't define one. So: static inline parity16(u16 val) { return __builtin_const_p(val) ? _parity_const(val) : _parity16(val); } #ifndef _parity16 static inline _parity16(u15 val) { return _parity8(val ^ (val >> 8)); } #endif You only need one _parity_const(). > > Furthermore, if a popcnt instruction is known to exist, then the parity > is simply popcnt(x) & 1. Beware that some popcnt instructions are slow. David > > -hpa > >