Re: Memory model release/acquire mode interactions of relaxed atomic operations

On 04/05/17 21:18, Toebs Douglass wrote:
> On 04/05/17 20:04, Andrew Haley wrote:
>> On 04/05/17 16:52, Toebs Douglass wrote:
>>> On 04/05/17 16:21, Andrew Haley wrote:
>>>> Either works.  The mappings from C++ atomics to processors are here:
>>>>
>>>> https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
>>>
>>> Ah, that is interesting, and makes a lot of sense.
>>>
>>> For SC, it's atomic.  For everything else, not - which means for
>>> everything else, although ordering is of course guaranteed, visibility
>>> is not, and we rely on the processor doing something "in a reasonable
>>> time" (which might for example be long enough that things break).
>>
>> Umm, what?  All access modes are atomic.
> 
> We have to be careful here, because we may have different ideas about
> what atomic means.  I may be completely wrong, but I think I understand
> memory barriers and atomic operations, but I think I do not always use
> formal terms exactly as they are used in the field.
> 
> So, here, I am not sure what you mean by atomic.
> 
> I do think though that everything other than SC is not atomic (as I use
> the word).  This means, for example, that the store may *never* be seen
> by any other core.  In practise I'm sure this doesn't happen, but there
> are no guarantees, and I think for code to be always correct, the
> assumption has to be made that it is so.

Well, you're going to have to stop using the word "atomic" to mean
that if you don't want to have very confusing conversations.  In terms
of correctness, weaker modes such as acquire/release can be just as
useful.  Sequential consistency is easier to reason about, and that's
its great benefit.
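
To make "all access modes are atomic" concrete, here is a minimal
sketch in C++11.  The function name count_with_relaxed and the thread
and iteration counts are made up for illustration.  Every increment
uses memory_order_relaxed, the weakest mode, yet no update is lost:
relaxed gives up ordering with respect to other memory locations, not
the indivisibility of the read-modify-write itself.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` threads each perform `per_thread`
// relaxed increments on one shared counter.
long count_with_relaxed(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Relaxed: no ordering guarantees, but still an
                // indivisible read-modify-write.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with the end of each thread, so the final
    // load below observes every increment.
    for (auto& th : pool) th.join();
    return counter.load(std::memory_order_relaxed);
}
```

Calling count_with_relaxed(4, 100000) from main should return exactly
400000 (build with -pthread on GCC).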

>>>>> I've just had a bit of an online search, as best I could, through the
>>>>> GCC source code.  It looks like expand_atomic_store() does use an atomic
>>>>> exchange or atomic CAS.
>>>>
>>>> That depends on your machine.  On mine (ARMv8) a seq.cst store uses stlr.
>>>
>>> I'm surprised.  I would expect that to be able to fail (because of the
>>> "reasonable time").  I don't know much about ARM though (or about Intel,
>>> for that matter :-)
>>
>> Eventually the processor will be pre-empted for some reason or the
>> cache line which contains the store will be flushed because of another
>> access, but it could be a long wait.  I've seen delays of thousands
>> of instructions, but it could be longer.
> 
> This - cache line flushing - is not the issue I have in mind, for to
> have reached a cache, the data in question will then be participating in
> the MESI protocol, and so it will be visible to other processors.
>
> The problem I have in mind is store buffers and that store barriers do
> not cause stores to complete. 

StoreLoad barriers do indeed flush the store buffer.  That's their
job.  (Or at least they must act as though they do, according to the
synchronization rules.  I don't know of any processor on which
StoreLoad doesn't flush the store buffer, but I suppose it's possible
that somebody could invent another way to do it.)
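
The classic "store buffering" litmus test shows what StoreLoad
ordering buys you.  This is only a sketch, and the names
both_read_zero_once and count_forbidden are invented for the example.
With seq_cst stores and loads, the C++ memory model forbids the
outcome where both threads read 0.  How the compiler enforces that (a
fence, special load/store instructions, an exchange) depends on the
target.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One run of the store-buffering litmus test.  Returns true if both
// threads read 0, which seq_cst forbids.
bool both_read_zero_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Repeat the test and count forbidden outcomes.
int count_forbidden(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_read_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

With seq_cst, count_forbidden must return 0 however many times it is
run.  If both accesses in each thread are weakened to release/acquire
or relaxed, the language no longer rules out r1 == r2 == 0, because
each store may still be sitting in a store buffer when the other
thread's load executes.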

> The processor performing the store will think it has issued the
> store and see the world accordingly *prior even to the store
> reaching the first level cache*, and there is no guarantee about how
> long this state of affairs persists.
>
> So if we perform a store and then a store barrier,

What exactly do you mean by "a store barrier"?

> we have nothing - there is no guarantee any other core has seen this
> store or ever will.  We only have a guarantee that IF a store
> *after* the store barrier is going to complete, all stores prior to
> the barrier will be forced to actually complete, so that they
> complete first (and thus honour the store barrier).

We don't know how long it will take for loads and stores to propagate.
But if Processor A does a store with some kind of synchronization
operation and Processor B then performs an action which is
synchronized after Processor A's store, then I can absolutely
guarantee that Processor B has seen Processor A's store.

Unless Processor B performs some kind of synchronization action, there
is indeed no guarantee that it'll see Processor A's store.
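
In C++ terms this is the usual message-passing pattern.  A minimal
sketch follows; the name message_pass and the value 42 are chosen
only for illustration.  The producer plays Processor A and the
consumer plays Processor B.  Once the acquire load has read the value
written by the release store, everything the producer wrote before
that store is guaranteed to be visible to the consumer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the payload as seen by the consumer.
int message_pass() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary store
        ready.store(true, std::memory_order_release);  // publish
    });
    std::thread consumer([&] {
        // Spin until the acquire load reads the released value.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write
    });

    producer.join();
    consumer.join();
    return seen;
}
```

message_pass() should always return 42.  Nothing here bounds how long
the consumer spins; the guarantee is only about what it sees once the
acquire load has observed the flag.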

> In other words, all stores which do not use LL/SC or LOCK (or equivalent
> thereof) can in effect never occur.

There's nothing special about LL/SC, compare-and-exchange, or
whatever.  It has its part in the synchronization order, just like
anything else.

Andrew.


