Commit-ID:  5aff60a191e579ae00ae5ca6ce16c13b687bc8a3
Gitweb:     http://git.kernel.org/tip/5aff60a191e579ae00ae5ca6ce16c13b687bc8a3
Author:     Pan Xinhui <xinhui.pan@xxxxxxxxxxxxxxxxxx>
AuthorDate: Wed, 2 Nov 2016 05:08:29 -0400
Committer:  Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 22 Nov 2016 12:48:10 +0100

locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()

An over-committed guest with more vCPUs than pCPUs has a heavy overload
in osq_lock(). This is because if vCPU-A holds the osq lock and yields
out, vCPU-B ends up waiting for per_cpu node->locked to be set. IOW,
vCPU-B waits for vCPU-A to run and unlock the osq lock.

Use the new vcpu_is_preempted(cpu) interface to detect if a vCPU is
currently running or not, and break out of the spin-loop if so.

test case:

$ perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

after patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

Suggested-by: Boqun Feng <boqun.feng@xxxxxxxxx>
Tested-by: Juergen Gross <jgross@xxxxxxxx>
Signed-off-by: Pan Xinhui <xinhui.pan@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Acked-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
Acked-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: David.Laight@xxxxxxxxxx
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: benh@xxxxxxxxxxxxxxxxxxx
Cc: bsingharora@xxxxxxxxx
Cc: dave@xxxxxxxxxxxx
Cc: kernellwp@xxxxxxxxx
Cc: konrad.wilk@xxxxxxxxxx
Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
Cc: mpe@xxxxxxxxxxxxxx
Cc: paulmck@xxxxxxxxxxxxxxxxxx
Cc: paulus@xxxxxxxxx
Cc: rkrcmar@xxxxxxxxxx
Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: will.deacon@xxxxxxx
Cc: xen-devel-request@xxxxxxxxxxxxxxxxxxxx
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/1478077718-37424-3-git-send-email-xinhui.pan@xxxxxxxxxxxxxxxxxx
[ Translated to English. ]
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
 kernel/locking/osq_lock.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 4ea2710..a316794 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -21,6 +21,11 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
+static inline int node_cpu(struct optimistic_spin_node *node)
+{
+	return node->cpu - 1;
+}
+
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -118,8 +123,10 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	while (!READ_ONCE(node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
+		 * Use vcpu_is_preempted() to avoid waiting for a preempted
+		 * lock holder:
 		 */
-		if (need_resched())
+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
 			goto unqueue;
 
 		cpu_relax();
--
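
For reference, here is a simplified sketch of how the osq_lock() wait loop
reads once this hunk is applied. It is an abridged reading of the diff above,
not the full function: the queueing setup and the unqueue path are elided.

	/*
	 * Spin on our own node->locked, which the CPU ahead of us in the
	 * MCS-style queue sets when it hands the lock over.
	 */
	while (!READ_ONCE(node->locked)) {
		/*
		 * Bail out to the unqueue path if this task needs to
		 * reschedule, or if the vCPU backing the node we are
		 * queued behind has been preempted by the hypervisor --
		 * spinning on it would only burn pCPU time.
		 */
		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
			goto unqueue;

		cpu_relax();
	}

The new node_cpu() helper simply undoes the +1 offset applied by
encode_cpu() (the offset lets the tail value use 0 to mean "no CPU"), so the
decoded CPU number can be passed to vcpu_is_preempted(). Where an
architecture provides no preemption information, the generic
vcpu_is_preempted() fallback just returns false, so the extra check is
essentially free on bare metal.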