On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> and it appears that all callers could do just as well with a lock/unlock
> pair. This commit therefore removes spin_unlock_wait() and related
> definitions from core code.
> 
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Will Deacon <will.deacon@xxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Cc: Andrea Parri <parri.andrea@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> ---
>  include/asm-generic/qspinlock.h |  14 -----
>  include/linux/spinlock.h        |  31 -----------
>  include/linux/spinlock_up.h     |   6 ---
>  kernel/locking/qspinlock.c      | 117 ----------------------------------------
>  4 files changed, 168 deletions(-)

[...]

> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index b2caec7315af..64a9051e4c2c 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -267,123 +267,6 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
>  #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
>  #endif
>  
> -/*
> - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> - * 'interesting' functions:
> - *
> - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> - * PPC). Also qspinlock has a similar issue per construction, the setting of
> - * the locked byte can be unordered acquiring the lock proper.
> - *
> - * This gets to be 'interesting' in the following cases, where the /should/s
> - * end up false because of this issue.
> - *
> - *
> - * CASE 1:
> - *
> - * So the spin_is_locked() correctness issue comes from something like:
> - *
> - *   CPU0                               CPU1
> - *
> - *   global_lock();                     local_lock(i)
> - *     spin_lock(&G)                      spin_lock(&L[i])
> - *     for (i)                            if (!spin_is_locked(&G)) {
> - *       spin_unlock_wait(&L[i]);           smp_acquire__after_ctrl_dep();
> - *                                          return;
> - *                                        }
> - *                                        // deal with fail
> - *
> - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> - * that there is exclusion between the two critical sections.
> - *
> - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> - *
> - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.

Might be worth keeping this comment about spin_is_locked, since we're not
removing that guy just yet!

Will
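
As an aside, for anyone carrying out-of-tree callers: a minimal sketch of the
lock/unlock-pair substitution the changelog describes might look like the
below. The lock and function names here are purely illustrative, not taken
from the patch:

	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(example_lock);	/* hypothetical lock */

	static void example_wait_for_holder(void)
	{
		/*
		 * Previously:
		 *
		 *	spin_unlock_wait(&example_lock);
		 *
		 * With spin_unlock_wait() gone, briefly acquire and release
		 * the lock instead; this orders us after any critical
		 * section that held example_lock when we got here.
		 */
		spin_lock(&example_lock);
		spin_unlock(&example_lock);
	}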