On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
>
> > > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > > acquisition on PPC -- putting it between the unlock and the lock
> > > of course doesn't cut it for the cross-thread unlock/lock case.
>
> This ^, that makes me think I don't understand
> smp_mb__after_unlock_lock.
>
> How is:
>
>     UNLOCK x
>     smp_mb__after_unlock_lock()
>     LOCK y
>
> a problem? That's still a full barrier.

I thought Paul was talking about something like this case:


CPU A     CPU B              CPU C
foo = 1
UNLOCK x
          LOCK x
          (RELEASE) bar = 1
                             ACQUIRE bar = 1
                             READ_ONCE foo = 0


but this looks the same as ISA2+lwsyncs/ISA2+lwsync+ctrlisync+lwsync,
which are both forbidden on PPC, so now I'm also confused.

The different-lock, same thread case is more straight-forward, I think.

> > > I am with Peter -- we do need the benchmark results for PPC.
> >
> > Urgh, sorry guys. I have been slowly doing some benchmarks, but time is not
> > plentiful at the moment.
> >
> > If we do a straight lwsync -> sync conversion for unlock it looks like that
> > will cost us ~4.2% on Anton's standard context switch benchmark.

Thanks Michael!

> And that does not seem to agree with Paul's smp_mb__after_unlock_lock()
> usage and would not be sufficient for the same (as of yet unexplained)
> reason.
>
> Why does it matter which of the LOCK or UNLOCK gets promoted to full
> barrier on PPC in order to become RCsc?

I think we need a PPC litmus test illustrating the inter-thread, same
lock failure case when smp_mb__after_unlock_lock is not present so that
we can reason about this properly. Paul?

Will