This is a note to let you know that I've just added the patch titled

    locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()

to the 4.5-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     locking-qspinlock-fix-spin_is_locked-and-spin_unlock_wait.patch
and it can be found in the queue-4.5 subdirectory.

If you, or anyone else, feels it should not be added to the stable
tree, please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Fri, 20 May 2016 18:04:36 +0200
Subject: locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()

From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>

commit 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa upstream.

Similar to commits:

  51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()")
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against
                 concurrent lockers")

qspinlock suffers from the fact that the _Q_LOCKED_VAL store is
unordered inside the ACQUIRE of the lock.

And while this is not a problem for the regular mutual exclusive
critical section usage of spinlocks, it breaks creative locking like:

	spin_lock(A)			spin_lock(B)
	spin_unlock_wait(B)		if (!spin_is_locked(A))
	do_something()			  do_something()

In that both CPUs can end up running do_something at the same time,
because our _Q_LOCKED_VAL store can drop past the spin_unlock_wait()
spin_is_locked() loads (even on x86!!).

To avoid making the normal case slower, add smp_mb()s to the less used
spin_unlock_wait() / spin_is_locked() side of things to avoid this
problem.

Reported-and-tested-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Reported-by: Giovanni Gherdovich <ggherdovich@xxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 include/asm-generic/qspinlock.h |   27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -28,7 +28,30 @@
  */
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
-	return atomic_read(&lock->val);
+	/*
+	 * queued_spin_lock_slowpath() can ACQUIRE the lock before
+	 * issuing the unordered store that sets _Q_LOCKED_VAL.
+	 *
+	 * See both smp_cond_acquire() sites for more detail.
+	 *
+	 * This however means that in code like:
+	 *
+	 *   spin_lock(A)		spin_lock(B)
+	 *   spin_unlock_wait(B)	spin_is_locked(A)
+	 *   do_something()		do_something()
+	 *
+	 * Both CPUs can end up running do_something() because the store
+	 * setting _Q_LOCKED_VAL will pass through the loads in
+	 * spin_unlock_wait() and/or spin_is_locked().
+	 *
+	 * Avoid this by issuing a full memory barrier between the spin_lock()
+	 * and the loads in spin_unlock_wait() and spin_is_locked().
+	 *
+	 * Note that regular mutual exclusion doesn't care about this
+	 * delayed store.
+	 */
+	smp_mb();
+	return atomic_read(&lock->val) & _Q_LOCKED_MASK;
 }
 
 /**
@@ -108,6 +131,8 @@ static __always_inline void queued_spin_
  */
 static inline void queued_spin_unlock_wait(struct qspinlock *lock)
 {
+	/* See queued_spin_is_locked() */
+	smp_mb();
 	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
 		cpu_relax();
 }


Patches currently in stable-queue which might be from peterz@xxxxxxxxxxxxx are

queue-4.5/locking-qspinlock-fix-spin_is_locked-and-spin_unlock_wait.patch
queue-4.5/perf-core-fix-perf_event_open-vs.-execve-race.patch
queue-4.5/perf-x86-intel-pt-generate-pmi-in-the-stop-region-as-well.patch
queue-4.5/sched-loadavg-fix-loadavg-artifacts-on-fully-idle-and-on-fully-loaded-systems.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
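
As an aside, the failure the commit message describes is the classic
"store buffering" pattern: each CPU's store (the _Q_LOCKED_VAL write
inside spin_lock()) can be satisfied after its subsequent load (the
spin_is_locked() / spin_unlock_wait() check) unless a full barrier sits
between them.  Below is a rough, purely illustrative userspace sketch of
the same pattern using C11 atomics and pthreads rather than the kernel's
qspinlock API; all names in it (flag_a, flag_b, cpu0, cpu1, entered) are
made up for the example and are not part of the patch.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int flag_a, flag_b;	/* stand-ins for the two lock words */
static atomic_int entered;		/* how many threads ran "do_something" */

static void *cpu0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag_a, 1, memory_order_relaxed);	/* "spin_lock(A)" store */
	atomic_thread_fence(memory_order_seq_cst);			/* analogue of the added smp_mb() */
	if (!atomic_load_explicit(&flag_b, memory_order_relaxed))	/* "!spin_is_locked(B)" */
		atomic_fetch_add(&entered, 1);				/* "do_something()" */
	return NULL;
}

static void *cpu1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag_b, 1, memory_order_relaxed);	/* "spin_lock(B)" store */
	atomic_thread_fence(memory_order_seq_cst);			/* analogue of the added smp_mb() */
	if (!atomic_load_explicit(&flag_a, memory_order_relaxed))	/* "!spin_is_locked(A)" */
		atomic_fetch_add(&entered, 1);				/* "do_something()" */
	return NULL;
}

int main(void)
{
	for (int i = 0; i < 100000; i++) {
		pthread_t t0, t1;

		atomic_store(&flag_a, 0);
		atomic_store(&flag_b, 0);
		atomic_store(&entered, 0);

		pthread_create(&t0, NULL, cpu0, NULL);
		pthread_create(&t1, NULL, cpu1, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (atomic_load(&entered) == 2)
			printf("iteration %d: both sides ran\n", i);
	}
	return 0;
}

With both atomic_thread_fence() calls in place (the userspace analogue of
the smp_mb()s this patch adds), "both sides ran" should never print.
Remove the two fences and the relaxed store may still be sitting in the
store buffer when the other thread's load executes, so both loads can
read 0 and the message can appear on x86, which is exactly the reordering
the commit message warns about.  Build with something like
"cc -O2 -pthread sb.c".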