On 2020-05-18, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> 	smp_mb(); /* LMM(data_push_tail:C) */
>>
>> +	if (atomic_long_try_cmpxchg_relaxed(&data_ring->tail_lpos,
>> +					    &tail_lpos,
>> +					    next_lpos)) { /* LMM(data_push_tail:D) */
>> +		break;
>> +	}
>
> Doing an "smp_mb()" followed by a "cmpxchg_relaxed" seems all kinds of
> odd and pointless, and is very much non-optimal on x86 for example.
>
> Just remove the smp_mb(), and use the non-relaxed form of cmpxchg.
> It's defined to be fully ordered if it succeeds (and if the cmpxchg
> doesn't succeed, it's a no-op and the memory barrier shouldn't make
> any difference).
>
> Otherwise you'll do two memory ordering operations on x86 (and
> probably some other architectures), since the cmpxchg is always
> ordered on x86 and there exists no "relaxed" form of it.

ACK. All three smp_mb()'s and both smp_wmb()'s sit directly next to
cmpxchg_relaxed() calls. Having explicit memory barriers was helpful
for identifying, proving, and testing a minimal set of pairs (on
arm64), but all will be folded into full cmpxchg()'s for the next
version.

John Ogness

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec