Re: [PATCH v6 4/5] MCS Lock: Barrier corrections

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Mon, 25 Nov 2013 09:56:16 -0800

On Mon, Nov 25, 2013 at 05:18:23PM +0000, Will Deacon wrote:
> Hi Peter, Linus,
> 
> On Mon, Nov 25, 2013 at 12:09:02PM +0000, Peter Zijlstra wrote:
> > On Sat, Nov 23, 2013 at 12:39:53PM -0800, Linus Torvalds wrote:
> > > On Sat, Nov 23, 2013 at 12:21 PM, Linus Torvalds
> > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > And as far as I can tell, the above gives you: A < B < C < D < E < F <
> > > > A. Which doesn't look possible.
> > > 
> > > Hmm.. I guess technically all of those cases aren't "strictly
> > > precedes" as much as "cannot have happened in the opposite order". So
> > > the "<" might be "<=". Which I guess *is* possible: "it all happened
> > > at the same time". And then the difference between your suggested
> > > "lwsync" and "sync" in the unlock path on CPU0 basically approximating
> > > the difference between "A <= B" and "A < B"..
> > > 
> > > Ho humm.
> > 
> > But remember, there's an actual full proper barrier between E and F, so
> > at best you'd end up with something like:
> > 
> >   A <= B <= C <= D <= E < F <= A
> > 
> > Which is still an impossibility.
> > 
> > I'm hoping others will explain things, as I'm very much on shaky ground
> > myself wrt transitivity.
> 
> The transitivity issues come about by having multiple, valid copies of the
> same data at a given moment in time (hence the term `multi-copy atomicity',
> where all of these copies appear to be updated at once).
> 
> Now, I'm not familiar with the Power memory model and the implementation
> intricacies between lwsync and sync, but I think a better way to think
> about this is to think of the cacheline state changes being broadcast as
> asynchronous requests, rather than necessarily responding to snoops from a
> canonical source.
> 
> So, in Paul's example, the upgrade requests on X and lock (shared -> invalid)
> may have reached CPU1, but not CPU2 by the time CPU2 reads X and therefore
> reads 0 from its shared line. It really depends on the multi-copy semantics
> you give to the different barrier instructions.

Exactly!  ;-)

> The other thing worth noting is that exclusive access instructions (e.g.
> ldrex and strex on ARM) may interact differently with barriers than conventional
> accesses, so lighter weight barriers can sometimes be acceptable for things
> like locks and atomics.
> 
> Does that help at all?

The differences between ldrex/strex and larx/stcx cannot come into play
in this example because there are only normal loads and stores, no atomic
instructions.
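
For anyone following the quoted ordering argument, here is a minimal sketch
(not part of the patch, and not a model of any real architecture) of why a
cycle such as "A <= B <= C <= D <= E < F <= A" is impossible while the
all-"<=" cycle is not. The function name satisfiable() and the (a, b, strict)
tuple encoding are assumptions made purely for illustration; the events are
treated as abstract points in a single timeline.

```python
from itertools import product


def satisfiable(events, constraints):
    """Return True if the ordering constraints can all hold at once.

    constraints is a list of (a, b, strict) tuples (a hypothetical encoding):
    strict=True means "a < b", strict=False means "a <= b".
    """
    # reach[u][v] is True when some chain of constraints forces u <= v.
    reach = {u: {v: u == v for v in events} for u in events}
    for a, b, _ in constraints:
        reach[a][b] = True
    # Transitive closure (Floyd-Warshall style; k is the outermost index).
    for k, i, j in product(events, repeat=3):
        if reach[i][k] and reach[k][j]:
            reach[i][j] = True
    # "a < b" is contradicted whenever the other constraints force b <= a.
    return not any(strict and reach[b][a] for a, b, strict in constraints)


events = "ABCDEF"
cycle = list(zip("ABCDEF", "BCDEFA"))  # A->B, B->C, ..., F->A

# Every link is "<=": consistent, everything "happened at the same time".
all_weak = satisfiable(events, [(a, b, False) for a, b in cycle])

# Same cycle, but E < F is strict (the full barrier between E and F).
one_strict = satisfiable(
    events, [(a, b, (a, b) == ("E", "F")) for a, b in cycle]
)

print(all_weak, one_strict)
```

Under these assumptions a single strict link anywhere in the cycle is enough
to make the whole cycle contradictory, which is the point of the
"E < F" step in the quoted chain.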

							Thanx, Paul

> Will
> 
