On Thu, Feb 2, 2023, at 15:50, Peter Zijlstra wrote:
> In order to replace cmpxchg_double() with the newly minted
> cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

I commented on this in the previous version but never got any reply
from you:

https://lore.kernel.org/all/1d88ba9f-5541-4b67-9cc8-a361eef36547@xxxxxxxxxxxxxxxx/

Unless I have misunderstood what you are doing, my concerns are still
the same:

>  #define this_cpu_cmpxchg(pcp, oval, nval) \
> -	__pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
> +	__pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
>  #define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \
>  	__pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2)

Having a variable-length this_cpu_cmpxchg() that turns into cmpxchg128()
and cmpxchg64() even on CPUs where this traps (!X86_FEATURE_CX16) seems
like a bad design to me.

I would much prefer fixed-length this_cpu_cmpxchg64()/this_cpu_cmpxchg128()
calls that never trap but fall back to the generic version on CPUs that
are lacking the atomics.

      Arnd
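
To make the point concrete, here is a rough userspace toy (completely
untested, not kernel code, and the names my_cmpxchg128(),
cmpxchg128_cx16() and cmpxchg128_generic() are made up for illustration
only): the fixed-size call checks CPUID and falls back to a lock-based
path instead of executing CMPXCHG16B on a CPU that would trap on it. In
the kernel the fallback would be the generic irq-disable percpu path,
and the feature check would of course be a static branch/alternative
rather than a cpuid per call:

/* Toy x86-64 userspace sketch, build with: gcc -O2 demo.c -pthread */
#include <cpuid.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

typedef unsigned __int128 u128;

static bool have_cx16(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx & bit_CMPXCHG16B;
}

/* Fast path: real 16-byte atomic, only reached when CX16 is present. */
static u128 cmpxchg128_cx16(u128 *ptr, u128 old, u128 new)
{
	unsigned long long lo = (unsigned long long)old;
	unsigned long long hi = (unsigned long long)(old >> 64);

	asm volatile("lock cmpxchg16b %[ptr]"
		     : [ptr] "+m" (*ptr), "+a" (lo), "+d" (hi)
		     : "b" ((unsigned long long)new),
		       "c" ((unsigned long long)(new >> 64))
		     : "memory");
	return ((u128)hi << 64) | lo;
}

/* Slow path: generic fallback under a lock; the kernel's percpu
 * fallback would disable interrupts instead of taking a mutex. */
static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;

static u128 cmpxchg128_generic(u128 *ptr, u128 old, u128 new)
{
	u128 cur;

	pthread_mutex_lock(&fallback_lock);
	cur = *ptr;
	if (cur == old)
		*ptr = new;
	pthread_mutex_unlock(&fallback_lock);
	return cur;
}

/* The fixed-size entry point: never traps, always returns the old value. */
static u128 my_cmpxchg128(u128 *ptr, u128 old, u128 new)
{
	return have_cx16() ? cmpxchg128_cx16(ptr, old, new)
			   : cmpxchg128_generic(ptr, old, new);
}

int main(void)
{
	static u128 v;	/* __int128 is naturally 16-byte aligned on x86-64 */
	u128 new = ((u128)1 << 64) | 2;

	my_cmpxchg128(&v, 0, new);
	printf("cx16=%d high=%llu low=%llu\n", have_cx16(),
	       (unsigned long long)(v >> 64), (unsigned long long)v);
	return 0;
}

The caller never has to know whether the CPU has the wide atomic; that
decision stays inside the fixed-size helper instead of leaking into a
variable-length this_cpu_cmpxchg().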