On Thu, Nov 28, 2013 at 06:03:18PM +0000, Will Deacon wrote:
> On Thu, Nov 28, 2013 at 05:38:53PM +0000, Paul E. McKenney wrote:
> > On Thu, Nov 28, 2013 at 11:40:59AM +0000, Will Deacon wrote:
> > > On Wed, Nov 27, 2013 at 05:11:43PM +0000, Paul E. McKenney wrote:
> > > > And in fact the unlock+lock barrier is all that RCU needs.  I guess
> > > > the question is whether it is worth having two flavors of
> > > > __after_spinlock(), one that is a full barrier with just the lock,
> > > > and another that is only guaranteed to be a full barrier with
> > > > unlock+lock.
> > >
> > > I think it's worth distinguishing those cases because, in my mind,
> > > one is potentially a lot heavier than the other. The risk is that we
> > > end up producing a set of strangely named barrier abstractions that
> > > nobody can figure out how to use properly:
> > >
> > > 	/*
> > > 	 * Prevent re-ordering of arbitrary accesses across spin_lock
> > > 	 * and spin_unlock.
> > > 	 */
> > > 	mb__after_spin_lock()
> > > 	mb__after_spin_unlock()
> > >
> > > 	/*
> > > 	 * Order spin_lock() vs spin_unlock()
> > > 	 */
> > > 	mb__between_spin_unlock_lock() /* Horrible name! */
> > >
> > > We could potentially replace the first set of barriers with
> > > spin_lock_mb() and spin_unlock_mb() variants (which would be more
> > > efficient than half barrier + full barrier), then we only end up
> > > with a strangely named barrier which applies to the non-_mb()
> > > spinlock routines.
> >
> > How about the current mb__before_spinlock() making the acquisition be
> > a full barrier, and an mb__after_spinlock() making a prior release
> > plus this acquisition be a full barrier?
>
> Hmm, without horrible hacks to keep track of whether we've done an
> mb__before_spinlock() without a matching spinlock(), that's going to
> end up with full-barrier + pointless half-barrier (similarly on the
> release path).

We should be able to detect an mb__before_spinlock() without a matching
spin_lock() via static analysis, right?  Or am I missing your point?

> > Yes, we might need better names, but I believe that this approach
> > does what you need.
> >
> > Thoughts?
>
> I still think we need to draw the distinction between ordering all
> accesses against a lock and ordering an unlock against a lock. The
> latter is free for arm64 (STLR => LDAR is ordered) but the former
> requires a DMB.

Not sure I completely got your drift...  Here is what I am suggesting:

o	mb__before_spinlock():

	o	Must appear immediately before a lock acquisition.

	o	Upgrades that lock acquisition to a full barrier.

	o	Emits DMB on ARM64.

o	mb__after_spinlock():

	o	Must appear immediately after a lock acquisition.

	o	Upgrades an unlock+lock pair to a full barrier.

	o	Emits a no-op on ARM64, as in "do { } while (0)".

	o	Might need a separate flavor for queued locks on some
		platforms, but no sign of that yet.

Does that make sense?

							Thanx, Paul
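
To make the proposed semantics concrete, here is a minimal usage sketch
in kernel-style C.  This is illustrative only: mb__after_spinlock() is
the primitive proposed above and does not exist in mainline at this
point, mb__before_spinlock() is assumed to have the strengthened
semantics proposed here, and the lock and variables (mylock, x, y, r1)
are made up for the example.

	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(mylock);
	static int x, y;

	static void full_barrier_via_lock(void)
	{
		int r1;

		/*
		 * Case 1: upgrade a single lock acquisition to a full
		 * barrier.  With acquire-only semantics, the store to x
		 * could otherwise slip into the critical section and be
		 * reordered with the load from y.
		 */
		ACCESS_ONCE(x) = 1;
		mb__before_spinlock();	/* acquisition below becomes smp_mb() */
		spin_lock(&mylock);
		r1 = ACCESS_ONCE(y);	/* cannot be satisfied before the store to x */
		spin_unlock(&mylock);

		/*
		 * Case 2: upgrade an unlock+lock pair to a full barrier.
		 * Accesses before the unlock are then ordered against
		 * accesses after the subsequent lock, as seen by all CPUs.
		 */
		spin_lock(&mylock);
		ACCESS_ONCE(x) = 1;
		spin_unlock(&mylock);

		spin_lock(&mylock);
		mb__after_spinlock();	/* prior unlock + this lock act as smp_mb() */
		r1 = ACCESS_ONCE(y);
		spin_unlock(&mylock);

		(void)r1;
	}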
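
And a sketch of what the arm64 side might look like under this proposal,
following the bullets above.  These are hypothetical definitions, not
mainline code:

	/*
	 * Hypothetical arm64 definitions for the proposal above.
	 *
	 * spin_lock() on arm64 has only acquire semantics, so upgrading
	 * a single acquisition to a full barrier needs a DMB, which is
	 * what smp_mb() emits.
	 */
	#define mb__before_spinlock()	smp_mb()

	/*
	 * A store-release (STLR) followed by a load-acquire (LDAR) is
	 * already fully ordered on arm64, so the unlock+lock case needs
	 * nothing extra here.
	 */
	#define mb__after_spinlock()	do { } while (0)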