2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>: > On 24/07/2017 10:57, Wanpeng Li wrote: >> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >> >> Preemption can occur in the preemption timer expiration handler: >> >> CPU0 CPU1 >> >> preemption timer vmexit >> handle_preemption_timer(vCPU0) >> kvm_lapic_expired_hv_timer >> hv_timer_is_use == true >> sched_out >> sched_in >> kvm_arch_vcpu_load >> kvm_lapic_restart_hv_timer >> restart_apic_timer >> start_hv_timer >> already-expired timer or sw timer triggerd in the window >> start_sw_timer >> cancel_hv_timer > > At this point, the timer interrupt is injected, right? Do you mean the new one on CPU1? I think we just set the pending timer, we return back to kvm_lapic_expired_hv_timer() after preempt notifier sched_in. > > If this is correct, kvm_lapic_expired_hv_timer can just do nothing if > the timer is not in use, with a comment explaining that the preemption > notifier has run start_sw_timer and thus injected the timer interrupt. > >> /* back in kvm_lapic_expired_hv_timer */ >> cancel_hv_timer >> WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops >> >> This can be reproduced if CONFIG_PREEMPT is enabled. >> >> This patch fixes it by don't cancel preemption timer repeatedly if >> the preemption timer has already been cancelled due to preemption >> since already-expired timer or sw timer triggered in the window. >> >> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> >> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx> >> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >> --- >> arch/x86/kvm/lapic.c | 10 +++++++--- >> 1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >> index 2819d4c..8341b40 100644 >> --- a/arch/x86/kvm/lapic.c >> +++ b/arch/x86/kvm/lapic.c >> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) >> { >> struct kvm_lapic *apic = vcpu->arch.apic; >> >> - WARN_ON(!apic->lapic_timer.hv_timer_in_use); >> - WARN_ON(swait_active(&vcpu->wq)); >> - cancel_hv_timer(apic); >> + preempt_disable(); >> + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { > > Why is the "if" necessary? > > Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in > preemption-disabled regions, which trivially avoids any reentrancy issue > with the preempt notifier. Then, cancel_hv_timer can assert that it's > called with preemption disabled. For example: static int handle_preemption_timer(struct kvm_vcpu *vcpu) { --------------------------------------------------> We still can be preempted here, and do one cancel_hv_timer() preempt_disable(); kvm_lapic_expired_hv_timer(vcpu); -----> WARN_ON in cancel_hv_timer() even if we remove the WARN_ON in kvm_lapic_expired_hv_timer() as you mentioned above preempt_enable(); return 1; } Regards, Wanpeng Li