On Tue, Feb 26, 2019 at 06:26:24PM +0000, Will Deacon wrote:
> On Fri, Feb 22, 2019 at 01:49:32PM -0800, Linus Torvalds wrote:

> So I *am* using __this_cpu_xchg() here, which means the architecture can
> get away with plain old loads and stores (which is what RISC-V does, for
> example), but I see that's not the case on e.g. x86 so I'll rework using
> read() and write() because it doesn't hurt.

Right, so the problem on x86 is that XCHG has an implicit LOCK prefix,
so there is no !atomic variant. So even the cpu-local xchg gets the LOCK
prefix penalty, even though all we really wanted is a single
instruction.

Arguably we could fix that for __this_cpu_xchg(), which isn't IRQ-safe.
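
To make the distinction concrete, here is a rough userspace sketch of the
two flavours. The function names are made up for illustration and this is
not the kernel's per-CPU implementation; it only shows an exchange built
from a plain load and store next to one built from an atomic
read-modify-write, which on x86 typically ends up as XCHG with a memory
operand:

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-ins, NOT the kernel's per-CPU macros. */

/*
 * Exchange as a plain load followed by a plain store. Not atomic:
 * an interrupt or another thread could run between the two accesses.
 */
int xchg_load_store(int *slot, int newval)
{
	int old = *slot;	/* plain load */

	*slot = newval;		/* plain store */
	return old;
}

/*
 * Exchange as a single atomic read-modify-write. On x86, compilers
 * typically emit XCHG with a memory operand here, which is implicitly
 * locked, so the LOCK cost is paid even for CPU-local data.
 */
int xchg_atomic(_Atomic int *slot, int newval)
{
	return atomic_exchange(slot, newval);
}
```

Both return the old value and leave the new one in place; the difference
is only in what the CPU has to do to get there.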