From: Wanpeng Li <wanpengli@xxxxxxxxxxx>

The overhead of kvm_vcpu_kick() is huge because of the expensive RCU,
memory barrier, and related operations in rcuwait_wake_up(). It is worse
for local delivery, since the vCPU is already scheduled in and we still
pay this cost. With ftrace we observe 12us+ for kvm_vcpu_kick() in the
kvm_pmu_deliver_pmi() path before the patch and 6us+ after the
optimization.

Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
---
 arch/x86/kvm/lapic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 76fb00921203..ec6997187c6d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1120,7 +1120,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 	case APIC_DM_NMI:
 		result = 1;
 		kvm_inject_nmi(vcpu);
-		kvm_vcpu_kick(vcpu);
+		if (vcpu != kvm_get_running_vcpu())
+			kvm_vcpu_kick(vcpu);
 		break;
 
 	case APIC_DM_INIT:
-- 
2.25.1