On 24/07/2017 17:08, Wanpeng Li wrote: > 2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>: >> On 24/07/2017 10:57, Wanpeng Li wrote: >>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >>> >>> Preemption can occur in the preemption timer expiration handler: >>> >>> CPU0 CPU1 >>> >>> preemption timer vmexit >>> handle_preemption_timer(vCPU0) >>> kvm_lapic_expired_hv_timer >>> hv_timer_is_use == true >>> sched_out >>> sched_in >>> kvm_arch_vcpu_load >>> kvm_lapic_restart_hv_timer >>> restart_apic_timer >>> start_hv_timer >>> already-expired timer or sw timer triggerd in the window >>> start_sw_timer >>> cancel_hv_timer >> >> At this point, the timer interrupt is injected, right? > > Do you mean the new one on CPU1? I think we just set the pending > timer, we return back to kvm_lapic_expired_hv_timer() after preempt > notifier sched_in. start_sw_timer calls apic_timer_expired after cancel_hv_timer. >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >>> index 2819d4c..8341b40 100644 >>> --- a/arch/x86/kvm/lapic.c >>> +++ b/arch/x86/kvm/lapic.c >>> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) >>> { >>> struct kvm_lapic *apic = vcpu->arch.apic; >>> >>> - WARN_ON(!apic->lapic_timer.hv_timer_in_use); >>> - WARN_ON(swait_active(&vcpu->wq)); >>> - cancel_hv_timer(apic); >>> + preempt_disable(); >>> + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { >> >> Why is the "if" necessary? >> >> Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in >> preemption-disabled regions, which trivially avoids any reentrancy issue >> with the preempt notifier. Then, cancel_hv_timer can assert that it's >> called with preemption disabled. > > For example: > > static int handle_preemption_timer(struct kvm_vcpu *vcpu) > { > --------------------------------------------------> We still can > be preempted here, and do one cancel_hv_timer() Yes, so that just means you can do preempt_disable(); /* The preempt notifier has called apic_timer_expired already. */ if (!apic->lapic_timer.hv_timer_in_use) goto out; Thinking more about it, even _the caller_ of start_hv_timer and start_sw_timer should be in a preemption-disabled region for simplicity. That means that ultimately restart_apic_timer, kvm_lapic_expired_hv_timer and kvm_lapic_switch_to_sw_timer should call preempt_disable/preempt_enable. Paolo