Re: [PATCH v6 4/5] MCS Lock: Barrier corrections

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 20, 2013 at 10:43:46AM -0800, Tim Chen wrote:
> On Wed, 2013-11-20 at 09:14 -0800, Paul E. McKenney wrote:
> > On Wed, Nov 20, 2013 at 03:46:43PM +0000, Will Deacon wrote:
> > > Hi Paul,
> > > 
> > > On Wed, Nov 20, 2013 at 03:31:23PM +0000, Paul E. McKenney wrote:
> > > > On Tue, Nov 19, 2013 at 05:37:43PM -0800, Tim Chen wrote:
> > > > > @@ -68,7 +72,12 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> > > > >  		while (!(next = ACCESS_ONCE(node->next)))
> > > > >  			arch_mutex_cpu_relax();
> > > > >  	}
> > > > > -	ACCESS_ONCE(next->locked) = 1;
> > > > > -	smp_wmb();
> > > > > +	/*
> > > > > +	 * Pass lock to next waiter.
> > > > > +	 * smp_store_release() provides a memory barrier to ensure
> > > > > +	 * all operations in the critical section has been completed
> > > > > +	 * before unlocking.
> > > > > +	 */
> > > > > +	smp_store_release(&next->locked, 1);
> > > > 
> > > > However, there is one problem with this that I missed yesterday.
> > > > 
> > > > Documentation/memory-barriers.txt requires that an unlock-lock pair
> > > > provide a full barrier, but this is not guaranteed if we use
> > > > smp_store_release() for unlock and smp_load_acquire() for lock.
> > > > At least one of these needs a full memory barrier.
> > > 
> > > Hmm, so in the following case:
> > > 
> > >   Access A
> > >   unlock()	/* release semantics */
> > >   lock()	/* acquire semantics */
> > >   Access B
> > > 
> > > A cannot pass beyond the unlock() and B cannot pass the before the lock().
> > > 
> > > I agree that accesses between the unlock and the lock can be move across both
> > > A and B, but that doesn't seem to matter by my reading of the above.
> > > 
> > > What is the problematic scenario you have in mind? Are you thinking of the
> > > lock() moving before the unlock()? That's only permitted by RCpc afaiu,
> > > which I don't think any architectures supported by Linux implement...
> > > (ARMv8 acquire/release is RCsc).
> > 
> > If smp_load_acquire() and smp_store_release() are both implemented using
> > lwsync on powerpc, and if Access A is a store and Access B is a load,
> > then Access A and Access B can be reordered.
> > 
> > Of course, if every other architecture will be providing RCsc implementations
> > for smp_load_acquire() and smp_store_release(), which would not be a bad
> > thing, then another approach is for powerpc to use sync rather than lwsync
> > for one or the other of smp_load_acquire() or smp_store_release().
> 
> Can we count on the xchg function in the beginning of mcs_lock to
> provide a memory barrier? It should provide an implicit memory
> barrier according to the memory-barriers document.

The problem with the implicit full barrier associated with the xchg()
function is that it is in the wrong place if the lock is contended.
We need to ensure that the previous lock holder's critical section
is seen by everyone to precede that of the next lock holder, and
we need transitivity.  The only operations that are in the right place
to force the needed ordering in the contended case are those involved
in the lock handoff.  :-(

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]