On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:

> > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > acquisition on PPC -- putting it between the unlock and the lock
> > of course doesn't cut it for the cross-thread unlock/lock case.

This ^, that makes me think I don't understand
smp_mb__after_unlock_lock.

How is:

	UNLOCK x
	smp_mb__after_unlock_lock()
	LOCK y

a problem? That's still a full barrier.

> > I am with Peter -- we do need the benchmark results for PPC.
>
> Urgh, sorry guys. I have been slowly doing some benchmarks, but time is not
> plentiful at the moment.
>
> If we do a straight lwsync -> sync conversion for unlock it looks like that
> will cost us ~4.2% on Anton's standard context switch benchmark.

And that does not seem to agree with Paul's smp_mb__after_unlock_lock()
usage and would not be sufficient for the same (as of yet unexplained)
reason.

Why does it matter which of the LOCK or UNLOCK gets promoted to full
barrier on PPC in order to become RCsc?