2016-07-07 20:29 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>
>
> On 07/07/2016 14:18, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>
>> We will go to vcpu_run() loop after L0 emulates VMRESUME which maybe
>> incur kvm_sched_out and kvm_sched_in operations since cond_resched()
>> will be called once need resched. Preemption timer will be reprogrammed
>> if vCPU is scheduled to a different pCPU. Then the preemption timer
>> bit of vmcs02 will be set if L0 enable preemption timer to run L1 even
>> if L1 doesn't enable preemption timer to run L2.
>>
>> This patch fix it by don't reprogram preemption timer of vmcs02 if L1's
>> vCPU is scheduled on diffent pCPU when we are in the way to vmresume
>> nested guest, and fallback to hrtimer based emulated method.
>>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Cc: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
>> Cc: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>> Cc: Haozhong Zhang <haozhong.zhang@xxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> ---
>> v3 -> v4:
>>  * fallback to hrtimer based emulated method when in the way to vmresume nested guest
>>
>>  arch/x86/kvm/x86.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 0cc6cf8..05137c0 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2743,8 +2743,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>                  mark_tsc_unstable("KVM discovered backwards TSC");
>>
>>          if (kvm_lapic_hv_timer_in_use(vcpu) &&
>> +            (is_guest_mode(vcpu) ||
>>              kvm_x86_ops->set_hv_timer(vcpu,
>> -                    kvm_get_lapic_tscdeadline_msr(vcpu)))
>> +                    kvm_get_lapic_tscdeadline_msr(vcpu))))
>>                  kvm_lapic_switch_to_sw_timer(vcpu);
>>          if (check_tsc_unstable()) {
>>                  u64 offset = kvm_compute_tsc_offset(vcpu,
>>
>
> Thanks, this is good as a fallback.  I'll try to fix it by getting the
> pin-based execution controls right but if I fail this patch is okay.

I believe we still need this patch even if you eventually implement
"L1 TSC deadline timer to trigger while L2 is running". Consider the
code you posted before:

         exec_control = vmcs12->pin_based_vm_exec_control;
+        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
         exec_control |= vmcs_config.pin_based_exec_ctrl;
-        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
+        if (vmx->hv_deadline_tsc == -1)
+                exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;

So there are still cases where the preemption timer bit of vmcs02 is
not set (when vmx->hv_deadline_tsc == -1). However, the scenario I
mentioned above in kvm_arch_vcpu_load() would still set it
unnecessarily.

Regards,
Wanpeng Li
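
P.S. To make the two pieces of logic above easier to follow, here is a
rough standalone model in C. It is only a sketch under assumptions: the
toy_* names and struct toy_vcpu are hypothetical stand-ins, not kernel
code, and the l0_timer_armed flag stands in for
"vmx->hv_deadline_tsc != -1". The value 0x40 is bit 6 of the pin-based
VM-execution controls ("activate VMX-preemption timer").

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 6 of the pin-based VM-execution controls. */
#define PIN_BASED_VMX_PREEMPTION_TIMER 0x40u

/* Hypothetical, simplified stand-in for the vCPU state discussed above. */
struct toy_vcpu {
    bool hv_timer_in_use;     /* models kvm_lapic_hv_timer_in_use() */
    bool guest_mode;          /* models is_guest_mode() */
    bool set_hv_timer_fails;  /* models a non-zero set_hv_timer() return */
    bool hv_timer_programmed; /* preemption timer was (re)programmed */
    bool using_sw_timer;      /* fell back to the hrtimer-based timer */
};

/* Models set_hv_timer(): returns 0 on success, non-zero on failure. */
static int toy_set_hv_timer(struct toy_vcpu *v)
{
    if (v->set_hv_timer_fails)
        return -1;
    v->hv_timer_programmed = true;
    return 0;
}

/* Models kvm_lapic_switch_to_sw_timer(). */
static void toy_switch_to_sw_timer(struct toy_vcpu *v)
{
    v->hv_timer_in_use = false;
    v->using_sw_timer = true;
}

/*
 * Models the patched condition in kvm_arch_vcpu_load(): in guest mode the
 * "||" short-circuits, so the preemption timer is never reprogrammed and
 * the vCPU falls back to the software timer.
 */
static void toy_vcpu_load(struct toy_vcpu *v)
{
    if (v->hv_timer_in_use &&
        (v->guest_mode || toy_set_hv_timer(v)))
        toy_switch_to_sw_timer(v);
}

/*
 * Models the vmcs02 pin-based control computation from the snippet above:
 * L1's own preemption timer bit is dropped, and L0's bit survives only
 * while L0 has a deadline armed.
 */
static uint32_t toy_vmcs02_pin_controls(uint32_t vmcs12_pin,
                                        uint32_t l0_pin,
                                        bool l0_timer_armed)
{
    uint32_t exec_control = vmcs12_pin;

    exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
    exec_control |= l0_pin;
    if (!l0_timer_armed)
        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
    return exec_control;
}
```

In this model, calling toy_vcpu_load() with guest_mode set leaves
hv_timer_programmed false and using_sw_timer true, which is the fallback
the patch is meant to provide.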