On Mon, Dec 19, 2022, at 16:35, Peter Zijlstra wrote:
> In order to replace cmpxchg_double() with the newly minted
> cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

Does this work on x86 chips without X86_FEATURE_CX16? As far as I can
tell, the new percpu_cmpxchg128_op uses the cmpxchg16b instruction
unconditionally without checking for the feature bit first, and it is
now used by this_cpu_cmpxchg() unconditionally as well.

This is fine for the moment if the only user is mm/slub.c and that
retains the system_has_cmpxchg128() runtime check, but I think a better
interface would be one that guarantees this_cpu_cmpxchg() always ends
up either in working inline asm or in the generic fallback, and never
in an invalid opcode.

For consistency, I would also suggest that this_cpu_cmpxchg() take the
same argument types as cmpxchg(): at most 'unsigned long' sized, with
additional this_cpu_cmpxchg64() and this_cpu_cmpxchg128() macros that
take fixed-size arguments. I have an older patch set that additionally
converts all 8-bit and 16-bit cmpxchg()/xchg() calls to
cmpxchg_8()/xchg_8()/cmpxchg_16()/xchg_16() and leaves only the
fixed 32-bit and variably typed 'unsigned long' sized callers for the
weakly typed variant.
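To make the first point concrete, here is a minimal sketch of the kind
of caller-side guard the series currently depends on. This is not the
actual mm/slub.c code; the pcp_pair variable and pair_try_update()
helper are made up for illustration:

/*
 * Sketch only: with the series as posted, every 128-bit user has to
 * carry a guard like this itself, because this_cpu_cmpxchg() on a
 * u128 compiles straight to cmpxchg16b even on CPUs that lack
 * X86_FEATURE_CX16.
 */
static DEFINE_PER_CPU(u128, pcp_pair);

static bool pair_try_update(u128 old, u128 new)
{
	if (!system_has_cmpxchg128())
		return false;	/* caller must take a locked slow path */

	return this_cpu_cmpxchg(pcp_pair, old, new) == old;
}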
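And roughly the shape I have in mind for the fixed-size macro, untested
and with the percpu_cmpxchg128_op() argument order guessed rather than
taken from the patch; this_cpu_generic_cmpxchg() is the existing
asm-generic helper that does the compare-and-store with interrupts
disabled:

#define this_cpu_cmpxchg128(pcp, oval, nval)				\
({									\
	u128 __old = (oval), __new = (nval), __ret;			\
	if (system_has_cmpxchg128())					\
		/* cmpxchg16b path, safe: feature bit checked above */	\
		__ret = percpu_cmpxchg128_op(pcp, __old, __new);	\
	else								\
		/* generic irq-off fallback, works on any CPU */	\
		__ret = this_cpu_generic_cmpxchg(pcp, __old, __new);	\
	__ret;								\
})

With something like that in place, mm/slub.c could still test
system_has_cmpxchg128() to pick its faster algorithm, but a caller that
forgets the check would get a correct (if slower) result rather than an
invalid opcode.

Arnd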