Hi Peter, On Tue, Feb 04, 2014 at 04:43:08PM +0000, Peter Zijlstra wrote: > On Tue, Feb 04, 2014 at 12:29:12PM +0000, Will Deacon wrote: > > @@ -112,17 +114,20 @@ static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new) > > unsigned long tmp; > > int oldval; > > > > + smp_mb(); > > + > > asm volatile("// atomic_cmpxchg\n" > > -"1: ldaxr %w1, %2\n" > > +"1: ldxr %w1, %2\n" > > " cmp %w1, %w3\n" > > " b.ne 2f\n" > > -" stlxr %w0, %w4, %2\n" > > +" stxr %w0, %w4, %2\n" > > " cbnz %w0, 1b\n" > > "2:" > > : "=&r" (tmp), "=&r" (oldval), "+Q" (ptr->counter) > > : "Ir" (old), "r" (new) > > : "cc", "memory"); > > > > + smp_mb(); > > return oldval; > > } > > > > Any particular reason atomic_cmpxchg() doesn't use the proposed rel + mb > scheme? It would be a waste to have atomic_cmpxchg() be more expensive > than it needs to be. Catalin mentioned this to me in the past, and I got hung up on providing full barrier semantics in the case that the comparison fails and we don't perform the release. If we make your change, ldxr cmp b.ne stlxr cbnz dmb ish which is basically just removing the first smp_mb(), then we allow the load component of a failing cmpxchg to be speculated, which would affect code doing: /* Spin waiting for some flag to clear */ while (atomic_read(&flag)); /* Now do the cmpxchg, which will fail and return the old value */ return cmpxchg(...); The cmpxchg could read old data and fail the comparison, because it speculated the ldxr before the reading of flag. Is that a problem? Will -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html