On Tue, Jun 07, 2016 at 08:45:53PM +0800, Boqun Feng wrote:
> On Tue, Jun 07, 2016 at 02:00:16PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 07, 2016 at 07:43:15PM +0800, Boqun Feng wrote:
> > > On Mon, Jun 06, 2016 at 06:08:36PM +0200, Peter Zijlstra wrote:
> > > > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > > > index ce2f75e32ae1..e1c29d352e0e 100644
> > > > --- a/kernel/locking/qspinlock.c
> > > > +++ b/kernel/locking/qspinlock.c
> > > > @@ -395,6 +395,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> > > >      * pending stuff.
> > > >      *
> > > >      * p,*,* -> n,*,*
> > > > +    *
> > > > +    * RELEASE, such that the stores to @node must be complete.
> > > >      */
> > > >     old = xchg_tail(lock, tail);
> > > >     next = NULL;
> > > > @@ -405,6 +407,15 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> > > >      */
> > > >     if (old & _Q_TAIL_MASK) {
> > > >         prev = decode_tail(old);
> > > > +       /*
> > > > +        * The above xchg_tail() is also load of @lock which generates,
> > > > +        * through decode_tail(), a pointer.
> > > > +        *
> > > > +        * The address dependency matches the RELEASE of xchg_tail()
> > > > +        * such that the access to @prev must happen after.
> > > > +        */
> > > > +       smp_read_barrier_depends();
> > >
> > > Should this barrier be put before decode_tail()? Because it's the
> > > dependency old -> prev that we want to protect here.
> >
> > I don't think it matters one way or the other. The old->prev
> > transformation is pure; it doesn't depend on any state other than old.
> >
>
> But wouldn't the old -> prev transformation get broken on Alpha
> semantically in theory? Because Alpha could reorder the LOAD part of the
> xchg_tail() and decode_tail(), which results in prev points to other
> cpu's mcs_node.

No; I don't think this is possible. The thing Alpha needs the barrier
for is to force its two cache halves into sync, such that dependent
loads are guaranteed ordered.
There is only a single load of @old, there is no second load; the
transform into @prev is a 'pure' function:

  https://en.wikipedia.org/wiki/Pure_function

(even if the transformation needs to load data; that data is static, it
never changes, and therefore it is impossible to observe a stale value).

> Though, this is fine in current code, because xchg_release() is actually
> xchg() on Alpha, and Alpha doesn't use qspinlock.

I actually have a patch that converts Alpha to use relaxed atomics :-)

But yeah, the point is entirely moot for Alpha not using qspinlock.

> > I put it between prev and dereferences of prev, because that's what made
> > most sense to me; but really anywhere between the load of @old and the
> > first dereference of @prev is fine I suspect.
>
> I understand the barrier here serves more for a documentation purpose,
> and I don't want to be paranoid ;-) I'm fine with the current place, just
> thought we could put it somewhere more conforming to our memory
> model.

I think it is in accordance: there is the load of @old and there are the
loads of dereferenced @prev; between those a barrier needs to be placed.
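
To make the 'pure function' point above concrete, here is a minimal
user-space sketch in C. It is not the kernel's code: all sketch_* names,
the array sizes, and the static array standing in for the per-CPU MCS
nodes are assumptions made for illustration, and the bit offsets are only
modelled on the qspinlock tail layout. It shows that the tail -> node
transform depends on nothing but its argument and on addresses that never
change, so there is no stale value to observe between the load of @old
and the computation of @prev.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified model; sizes and offsets are assumptions. */
#define SKETCH_NR_CPUS          4
#define SKETCH_MAX_NODES        4
#define SKETCH_TAIL_IDX_OFFSET  16
#define SKETCH_TAIL_IDX_MASK    (0x3u << SKETCH_TAIL_IDX_OFFSET)
#define SKETCH_TAIL_CPU_OFFSET  18

struct sketch_node {
    struct sketch_node *next;
    int locked;
};

/*
 * Stand-in for the per-CPU node storage. The addresses of the elements
 * are fixed for the lifetime of the program; only their contents change.
 */
static struct sketch_node sketch_nodes[SKETCH_NR_CPUS][SKETCH_MAX_NODES];

/* Pack (cpu, idx) into a tail value; cpu is stored +1 so 0 means "no tail". */
static uint32_t sketch_encode_tail(int cpu, int idx)
{
    return ((uint32_t)(cpu + 1) << SKETCH_TAIL_CPU_OFFSET) |
           ((uint32_t)idx << SKETCH_TAIL_IDX_OFFSET);
}

/*
 * Pure transform: the result is computed from 'tail' and from constant
 * addresses only. It never reads the contents of any node, so it cannot
 * be affected by what other CPUs have or have not stored yet.
 */
static struct sketch_node *sketch_decode_tail(uint32_t tail)
{
    int cpu = (int)(tail >> SKETCH_TAIL_CPU_OFFSET) - 1;
    int idx = (int)((tail & SKETCH_TAIL_IDX_MASK) >> SKETCH_TAIL_IDX_OFFSET);

    return &sketch_nodes[cpu][idx];
}
```

Calling sketch_decode_tail() on the same tail value any number of times
yields the same pointer. The ordering question only arises once that
pointer is dereferenced, which is why the barrier can sit anywhere between
the load of @old and the first access through @prev.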