On 05/05/17 19:32, Andrew Haley wrote:
> On 05/05/17 18:26, Toebs Douglass wrote:
>> I may be wrong, but I think you might be thinking about store barriers
>> being built into atomic operations; if so, I don't mean that in what I
>> say. I mean that a bare atomic operation (LL/SC / LOCK), bereft of
>> memory barriers, forces a store to memory, and that this in turn forces
>> the honouring of earlier store barriers.
>
> I know I said I'd reached the end, but I have to point out that this
> is certainly wrong.

> A CAS is a Processor-local operation, and does not need to access
> memory. The cache coherence protocol is sufficient.

Yes. This is so. However, in and of itself it doesn't matter for the
example I described; all that matters is that *a* store completes - and
the store which completes can have absolutely nothing to do with the
stores which occurred prior to the earlier store barrier. All that
matters is that it happens.

> And a CAS
> certainly is not guaranteed to do anything with any other stores.
> On some processors it will; others not.

A store barrier issued prior to the CAS will impose an ordering
constraint such that all stores prior to the store barrier must complete
prior to any store after the barrier. (The store barrier itself,
however, specifically does not cause any stores to complete.)

So we can issue a store barrier (which itself does not complete any
stores) and then *later* issue a CAS. Because the CAS forces a store,
the stores earlier than the store barrier MUST now complete - or we
violate the constraint imposed by the store barrier.
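
To make the sequence concrete, here is a minimal sketch in C11 of the pattern being discussed: a plain store, then a store barrier, then *later* a CAS which carries no barrier of its own. The names `payload`, `flag` and `publish` are made up for illustration. Two assumptions to flag: C11 has no pure store barrier, so a release fence stands in for one (it is somewhat stronger), and the C11 model only describes ordering as observed by other threads, so this shows the order of operations and is not a demonstration of when the hardware completes any given store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
int payload;        /* plain data, stored before the barrier   */
atomic_int flag;    /* target of the later CAS, starts at zero */

/* Store, store barrier, then a relaxed ("bare") CAS.
 * Returns true if the CAS changed flag from 0 to 1. */
bool publish(int value)
{
    /* The earlier store. */
    payload = value;

    /* Assumption: a release fence is the closest portable stand-in
     * for a store barrier. It constrains ordering; it does not by
     * itself force the store above out to memory. */
    atomic_thread_fence(memory_order_release);

    /* The later CAS, with relaxed ordering on success and failure,
     * i.e. with no memory barrier built into it. */
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &flag, &expected, 1,
        memory_order_relaxed, memory_order_relaxed);
}
```

The first call to `publish` swaps `flag` from 0 to 1 and returns true; later calls return false, because the CAS fails and writes nothing to `flag`. The store to `payload` still happens on every call.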