On Tue, Dec 20, 2022 at 08:31:19AM -0600, Linus Torvalds wrote:
> On Tue, Dec 20, 2022 at 5:09 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, Dec 19, 2022 at 12:07:25PM -0800, Boqun Feng wrote:
> > >
> > > I wonder whether we should use "(*(u128 *)ptr)" instead of "(*(unsigned
> > > long *) ptr)"? Because compilers may think only 64bit value pointed by
> > > "ptr" gets modified, and they are allowed to do "useful" optimization.
> >
> > In this I've copied the existing cmpxchg_double() code; I'll have to let
> > the arch folks speak here, I've no clue.
>
> It does sound like the right thing to do. I doubt it ends up making a
> difference in practice, but yes, the asm doesn't have a memory
> clobber, so the input/output types should be the right ones for the
> compiler to not possibly do something odd and cache the part that it
> doesn't see as being accessed.

Right, and x86 does just *ptr, without trying to cast away the volatile
even.

I've pushed out a *(u128 *)ptr variant for arm64 and s390, then at least
we'll know if the compiler objects.
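
For illustration, here is a minimal userspace sketch of the operand-type
point discussed above. With no "memory" clobber, the type of the "+m"
operand is all the compiler has to go on for how many bytes the asm may
modify. The name sketch_try_cmpxchg128() is hypothetical and this is not
the kernel's implementation. The non-x86-64 branch is a plain, non-atomic
stand-in, included only so the example builds on other 64-bit targets.

```c
#include <stdbool.h>
#include <stdint.h>

typedef unsigned __int128 u128;	/* GCC/Clang extension on 64-bit targets */

/*
 * Hypothetical helper, for illustration only.
 * If *ptr == *old, store new into *ptr and return true. Otherwise copy
 * the current value of *ptr into *old and return false.
 */
static bool sketch_try_cmpxchg128(u128 *ptr, u128 *old, u128 new)
{
#if defined(__x86_64__)
	uint64_t lo = (uint64_t)*old;
	uint64_t hi = (uint64_t)(*old >> 64);
	unsigned char ok;

	/*
	 * There is no "memory" clobber here, so the "+m" operand is the
	 * only description of what the asm reads and writes. Passing *ptr
	 * (a u128) covers all 16 bytes; "+m" (*(unsigned long *)ptr)
	 * would describe only the low 8 bytes.
	 */
	__asm__ __volatile__("lock cmpxchg16b %[mem]\n\t"
			     "sete %[ok]"
			     : [mem] "+m" (*ptr), [ok] "=q" (ok),
			       "+a" (lo), "+d" (hi)
			     : "b" ((uint64_t)new), "c" ((uint64_t)(new >> 64))
			     : "cc");

	/* On failure RDX:RAX holds the value found in memory. */
	*old = ((u128)hi << 64) | lo;
	return ok;
#else
	/* Non-atomic stand-in so the sketch still builds elsewhere. */
	bool ok = (*ptr == *old);

	if (ok)
		*ptr = new;
	else
		*old = *ptr;
	return ok;
#endif
}
```

A caller would pass a naturally aligned u128 (cmpxchg16b faults on a
misaligned operand), seed *old with the expected value, and on a false
return retry with the refreshed *old.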