On Thu, Apr 09, 2015 at 09:57:21PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote:
>
> > @@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node)
> >  }
> >
> >  /*
> > + * Called after setting next->locked = 1 & lock acquired.
> > + * Check if the the CPU has been halted. If so, set the _Q_SLOW_VAL flag
> > + * and put an entry into the lock hash table to be waken up at unlock time.
> >   */
> > -static void pv_kick_node(struct mcs_spinlock *node)
> > +static void pv_scan_next(struct qspinlock *lock, struct mcs_spinlock *node)
>
> I'm not too sure about that name change..
>
> >  {
> >      struct pv_node *pn = (struct pv_node *)node;
> > +    struct __qspinlock *l = (void *)lock;
> >
> >      /*
> > +     * Transition CPU state: halted => hashed
> > +     * Quit if the transition failed.
> >       */
> > +    if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
> > +        return;
> > +
> > +    /*
> > +     * Put the lock into the hash table & set the _Q_SLOW_VAL in the lock.
> > +     * As this is the same CPU that will check the _Q_SLOW_VAL value and
> > +     * the hash table later on at unlock time, no atomic instruction is
> > +     * needed.
> > +     */
> > +    WRITE_ONCE(l->locked, _Q_SLOW_VAL);
> > +    (void)pv_hash(lock, pn);
> >  }
>
> This is broken. The unlock path relies on:
>
>     pv_hash()
>     MB
>     l->locked = SLOW
>
> such that when it observes SLOW, it must then also observe a consistent
> bucket.
>
> The above can have us do pv_hash_find() _before_ we actually hash the
> lock, which will result in us triggering that BUG_ON() in there.

Urgh, clearly it's late and I cannot read. The comment explains it.
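
For reference, here is a rough user-space sketch of the ordering the unlock path would need if the CPU that hashes the lock and the CPU that unlocks it could differ: fill the bucket, then publish SLOW with release semantics, and pair that with an acquire load on the unlock side. This is only an illustration with C11 atomics, not the kernel code. All names (toy_lock, toy_bucket, toy_publish, toy_unlock_lookup) and the constant values are hypothetical, and the one-entry table stands in for the real hash.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lock byte values; not the kernel's definitions. */
enum { TOY_LOCKED = 1, TOY_SLOW = 3 };

struct toy_lock {
    _Atomic unsigned char locked;
};

/* One-entry "hash table" mapping a lock to its waiting node (assumption). */
struct toy_bucket {
    struct toy_lock *lock;
    void *node;
};

static struct toy_bucket toy_table;

/* Publish side: fill the bucket first, then make SLOW visible. */
static void toy_publish(struct toy_lock *l, void *node)
{
    toy_table.lock = l;
    toy_table.node = node;
    /* Release store: the bucket writes above are visible before SLOW is. */
    atomic_store_explicit(&l->locked, TOY_SLOW, memory_order_release);
}

/* Unlock side: returns the hashed node, or NULL when SLOW was never set. */
static void *toy_unlock_lookup(struct toy_lock *l)
{
    unsigned char v = atomic_load_explicit(&l->locked, memory_order_acquire);

    if (v != TOY_SLOW)
        return NULL;
    /* Having observed SLOW, the bucket must be consistent (cf. the BUG_ON). */
    assert(toy_table.lock == l);
    return toy_table.node;
}
```

In the quoted patch the same CPU both hashes the lock and later unlocks it, so program order is enough and the store to l->locked may precede pv_hash(), as the patch comment says. The release/acquire pairing above only matters in the cross-CPU case.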