On Mon, Feb 6, 2023, at 12:24, Peter Zijlstra wrote:
> On Fri, Feb 03, 2023 at 06:25:04PM +0100, Arnd Bergmann wrote:
>> Unless I have misunderstood what you are doing, my concerns are
>> still the same:
>>
>> >  #define this_cpu_cmpxchg(pcp, oval, nval) \
>> > -	__pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
>> > +	__pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
>> >  #define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1,
>> > nval2) \
>> >  	__pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2,
>> > oval1, oval2, nval1, nval2)
>>
>> Having a variable-length this_cpu_cmpxchg() that turns into cmpxchg128()
>> and cmpxchg64() even on CPUs where this traps (!X86_FEATURE_CX16) seems
>> like a bad design to me.
>>
>> I would much prefer fixed-length this_cpu_cmpxchg64()/this_cpu_cmpxchg128()
>> calls that never trap but fall back to the generic version on CPUs that
>> are lacking the atomics.
>
> You're thinking acidental usage etc..? Lemme see what I can do.

I wouldn't even call it accidental when the dependency is so subtle:
Having to call system_has_cmpxchg64() before calling cmpxchg64()
is already somewhat awkward but has some logic to it. Having to
call system_has_cmpxchg64()/system_has_cmpxchg128() before calling
this_cpu_cmpxchg(), depending on the argument size, on architectures
that sometimes have cmpxchg128, but not on architectures that always
have it or that never have it, makes it useless as an abstraction.

      Arnd
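To make the proposal above concrete, here is a rough userspace sketch of a fixed-width helper that never traps, because the feature check lives inside the helper rather than at every call site. It is not kernel code: every `sketch_` name and the `u128_sketch` type are made up for illustration, the flag merely stands in for a check like system_has_cmpxchg128(), and both paths are modeled as a plain, non-atomic compare-and-store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value type for this sketch. */
typedef struct { uint64_t lo, hi; } u128_sketch;

/* Hypothetical stand-in for a runtime check such as system_has_cmpxchg128(). */
static bool sketch_has_cmpxchg128;

/* Counters so a caller can observe which path was taken. */
static int native_calls, fallback_calls;

static bool u128_equal(u128_sketch a, u128_sketch b)
{
	return a.lo == b.lo && a.hi == b.hi;
}

/*
 * Stand-in for the native instruction path (cmpxchg16b on x86-64).
 * Modeled as a plain compare-and-store; NOT atomic in this sketch.
 */
static u128_sketch sketch_native_cmpxchg128(u128_sketch *p, u128_sketch oldv,
					    u128_sketch newv)
{
	u128_sketch cur = *p;

	native_calls++;
	if (u128_equal(cur, oldv))
		*p = newv;
	return cur;
}

/*
 * Stand-in for the generic fallback. In the kernel this would run with
 * interrupts disabled around the compare-and-store; that is omitted here.
 */
static u128_sketch sketch_generic_cmpxchg128(u128_sketch *p, u128_sketch oldv,
					     u128_sketch newv)
{
	u128_sketch cur = *p;

	fallback_calls++;
	if (u128_equal(cur, oldv))
		*p = newv;
	return cur;
}

/* Fixed-width entry point: callers never need to test the feature first. */
static u128_sketch sketch_this_cpu_cmpxchg128(u128_sketch *p, u128_sketch oldv,
					      u128_sketch newv)
{
	if (sketch_has_cmpxchg128)
		return sketch_native_cmpxchg128(p, oldv, newv);
	return sketch_generic_cmpxchg128(p, oldv, newv);
}

/* Same call site works whether or not the feature is present. */
static void example_usage(void)
{
	u128_sketch v = { 1, 2 };
	u128_sketch one_two = { 1, 2 }, three_four = { 3, 4 };
	u128_sketch prev;

	sketch_has_cmpxchg128 = false;	/* CPU lacking the instruction */
	prev = sketch_this_cpu_cmpxchg128(&v, one_two, three_four);
	assert(u128_equal(prev, one_two));
	assert(u128_equal(v, three_four));

	sketch_has_cmpxchg128 = true;	/* CPU with the instruction */
	prev = sketch_this_cpu_cmpxchg128(&v, one_two, three_four);
	assert(u128_equal(prev, three_four));	/* mismatch: v unchanged */
	assert(u128_equal(v, three_four));
}
```

The point of the sketch is only the shape of the interface: the caller picks the width by name, and the choice between the native and the generic path is hidden behind that name.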