From: Will Deacon <will.deacon@xxxxxxx> commit b247be3fe89b6aba928bf80f4453d1c4ba8d2063 upstream. On x86, atomic_cond_read_relaxed will busy-wait with a cpu_relax() loop, so it is desirable to increase the number of times we spin on the qspinlock lockword when it is found to be transitioning from pending to locked. According to Waiman Long: | Ideally, the spinning times should be at least a few times the typical | cacheline load time from memory which I think can be down to 100ns or | so for each cacheline load with the newest systems or up to several | hundreds ns for older systems. which in his benchmarking corresponded to 512 iterations. Suggested-by: Waiman Long <longman@xxxxxxxxxx> Signed-off-by: Will Deacon <will.deacon@xxxxxxx> Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Acked-by: Waiman Long <longman@xxxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: boqun.feng@xxxxxxxxx Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx Cc: paulmck@xxxxxxxxxxxxxxxxxx Link: http://lkml.kernel.org/r/1524738868-31318-5-git-send-email-will.deacon@xxxxxxx Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> --- arch/x86/include/asm/qspinlock.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h index cf4cdf508ef42..2cb6624acaec6 100644 --- a/arch/x86/include/asm/qspinlock.h +++ b/arch/x86/include/asm/qspinlock.h @@ -6,6 +6,8 @@ #include <asm-generic/qspinlock_types.h> #include <asm/paravirt.h> +#define _Q_PENDING_LOOPS (1 << 9) + #define queued_spin_unlock queued_spin_unlock /** * queued_spin_unlock - release a queued spinlock -- 2.20.1