From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

After L0 emulates VMRESUME we return to the vcpu_run() loop, which may
incur kvm_sched_out and kvm_sched_in operations, since cond_resched() is
called whenever a reschedule is needed. The preemption timer is
reprogrammed if the vCPU is scheduled to a different pCPU. As a result,
the preemption timer bit of vmcs02 is set if L0 enables the preemption
timer to run L1, even if L1 doesn't enable the preemption timer to run L2.

Fix this by not reprogramming the preemption timer of vmcs02 if L1's vCPU
is scheduled on a different pCPU while we are on the way to vmresume the
nested guest, and fall back to the hrtimer-based emulation method instead.

Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
Cc: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
Cc: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
Cc: Haozhong Zhang <haozhong.zhang@xxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
v3 -> v4:
 * fall back to the hrtimer-based emulation method when on the way to
   vmresume the nested guest

 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0cc6cf8..05137c0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2743,8 +2743,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 			mark_tsc_unstable("KVM discovered backwards TSC");
 
 		if (kvm_lapic_hv_timer_in_use(vcpu) &&
+				(is_guest_mode(vcpu) ||
 				kvm_x86_ops->set_hv_timer(vcpu,
-					kvm_get_lapic_tscdeadline_msr(vcpu)))
+					kvm_get_lapic_tscdeadline_msr(vcpu))))
 			kvm_lapic_switch_to_sw_timer(vcpu);
 		if (check_tsc_unstable()) {
 			u64 offset = kvm_compute_tsc_offset(vcpu,
-- 
1.9.1
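
For readers who want to check the short-circuit behaviour of the new
condition, here is a minimal userspace sketch. It is not kernel code: the
struct model_vcpu type and the model_* functions are hypothetical stand-ins
for kvm_lapic_hv_timer_in_use(), is_guest_mode(), the set_hv_timer() hook
and kvm_lapic_switch_to_sw_timer(). The only assumption about the real hook
is that a nonzero return means failure, as the original condition implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the patched condition inspects. */
struct model_vcpu {
	bool hv_timer_in_use;    /* models kvm_lapic_hv_timer_in_use(vcpu) */
	bool guest_mode;         /* models is_guest_mode(vcpu) */
	bool set_hv_timer_fails; /* makes the modelled hook return nonzero */
	int  set_hv_timer_calls; /* counts reprogramming attempts */
	bool sw_timer;           /* set when we fall back to the hrtimer path */
};

/* Models the set_hv_timer() hook: nonzero return is assumed to mean failure. */
static int model_set_hv_timer(struct model_vcpu *v)
{
	v->set_hv_timer_calls++;
	return v->set_hv_timer_fails ? -1 : 0;
}

/* Models only the patched condition from kvm_arch_vcpu_load(). */
void model_vcpu_load(struct model_vcpu *v)
{
	if (v->hv_timer_in_use &&
	    (v->guest_mode || model_set_hv_timer(v)))
		v->sw_timer = true; /* models kvm_lapic_switch_to_sw_timer() */
}

/* Walks the four interesting cases. */
void model_examples(void)
{
	/* Nested guest: hook is never called, fall back to the sw timer. */
	struct model_vcpu nested = { .hv_timer_in_use = true, .guest_mode = true };
	model_vcpu_load(&nested);
	assert(nested.sw_timer && nested.set_hv_timer_calls == 0);

	/* Not nested, hook succeeds: keep using the hv timer. */
	struct model_vcpu ok = { .hv_timer_in_use = true };
	model_vcpu_load(&ok);
	assert(!ok.sw_timer && ok.set_hv_timer_calls == 1);

	/* Not nested, hook fails: fall back, as before the patch. */
	struct model_vcpu fail = { .hv_timer_in_use = true,
				   .set_hv_timer_fails = true };
	model_vcpu_load(&fail);
	assert(fail.sw_timer && fail.set_hv_timer_calls == 1);

	/* hv timer not in use: nothing happens, even in guest mode. */
	struct model_vcpu idle = { .guest_mode = true };
	model_vcpu_load(&idle);
	assert(!idle.sw_timer && idle.set_hv_timer_calls == 0);
}
```

Because is_guest_mode() sits on the left of the ||, the hook is skipped
entirely in the nested case, so in this model the preemption timer is never
reprogrammed while the vCPU is in guest mode.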