Re: [PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Thu, 9 Apr 2015 21:57:21 +0200

On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote:

> @@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node)
>  }
>  
>  /*
> + * Called after setting next->locked = 1 & lock acquired.
> + * Check if the the CPU has been halted. If so, set the _Q_SLOW_VAL flag
> + * and put an entry into the lock hash table to be waken up at unlock time.
>   */
> -static void pv_kick_node(struct mcs_spinlock *node)
> +static void pv_scan_next(struct qspinlock *lock, struct mcs_spinlock *node)

I'm not too sure about that name change..

>  {
>  	struct pv_node *pn = (struct pv_node *)node;
> +	struct __qspinlock *l = (void *)lock;
>  
>  	/*
> +	 * Transition CPU state: halted => hashed
> +	 * Quit if the transition failed.
>  	 */
> +	if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
> +		return;
> +
> +	/*
> +	 * Put the lock into the hash table & set the _Q_SLOW_VAL in the lock.
> +	 * As this is the same CPU that will check the _Q_SLOW_VAL value and
> +	 * the hash table later on at unlock time, no atomic instruction is
> +	 * needed.
> +	 */
> +	WRITE_ONCE(l->locked, _Q_SLOW_VAL);
> +	(void)pv_hash(lock, pn);
>  }

This is broken. The unlock path relies on:

  pv_hash()
   MB
  l->locked = SLOW

such that when it observes SLOW, it must then also observe a consistent
bucket.

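The patch does it the other way around: it writes SLOW first and only
then calls pv_hash(). So the unlock path can observe SLOW and do
pv_hash_find() _before_ we actually hash the lock, which will result in
us triggering that BUG_ON() in there.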
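
To make the required ordering concrete, here is a minimal userspace sketch
using C11 atomics. It is not the kernel code: the demo_* names, the
one-bucket table and the Q_* values are made up for illustration, and a
release store stands in for the MB above. The only point is that the
bucket is filled before SLOW becomes visible, so whoever observes SLOW
also finds the entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative values only, not taken from the kernel headers. */
#define Q_LOCKED_VAL 1
#define Q_SLOW_VAL   3

struct demo_lock {
	_Atomic unsigned char locked;
};

struct demo_bucket {
	struct demo_lock *lock;
	void *node;
};

/* Hypothetical one-bucket "hash table", enough to show the ordering. */
static struct demo_bucket demo_table[1];

static void demo_hash(struct demo_lock *lock, void *node)
{
	demo_table[0].node = node;
	demo_table[0].lock = lock;
}

/* Returns the hashed node and clears the bucket, or NULL if not hashed. */
static void *demo_hash_find(struct demo_lock *lock)
{
	void *node;

	if (demo_table[0].lock != lock)
		return NULL;	/* the real code would hit BUG_ON() here */

	node = demo_table[0].node;
	demo_table[0].lock = NULL;
	demo_table[0].node = NULL;
	return node;
}

/* Publish side: hash first, then make SLOW visible with release order. */
static void demo_publish_slow(struct demo_lock *lock, void *node)
{
	demo_hash(lock, node);
	atomic_store_explicit(&lock->locked, Q_SLOW_VAL,
			      memory_order_release);
}

/* Unlock side: clear the lock byte; only look up the bucket on SLOW. */
static void *demo_unlock(struct demo_lock *lock)
{
	unsigned char old;

	old = atomic_exchange_explicit(&lock->locked, 0,
				       memory_order_acq_rel);
	if (old != Q_SLOW_VAL)
		return NULL;	/* fast path, nobody to kick */

	return demo_hash_find(lock);
}
```

Swapping the two statements in demo_publish_slow() reproduces the problem:
a concurrent demo_unlock() could then see SLOW while the bucket is still
empty, and demo_hash_find() would return NULL.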