On 28/09/2016 01:07, Michael S. Tsirkin wrote: > On Tue, Sep 27, 2016 at 11:20:12PM +0200, Paolo Bonzini wrote: >> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU >> is blocked", 2015-09-18) the posted interrupt descriptor is checked >> unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to >> trigger the scan and, if NMIs or SMIs are not involved, we can avoid >> the complicated event injection path. >> >> However, there is a race between vmx_deliver_posted_interrupt and >> vcpu_enter_guest. Fix it by disabling interrupts before vcpu->mode is >> set to IN_GUEST_MODE. > > Could you describe the race a bit more please? > I'm surprised that locally disabling irqs > fixes a race with a different CPUs. The posted interrupt IPI has a dummy handler in arch/x86/kernel/irq.c (smp_kvm_posted_intr_ipi). It only does something if it is received while the guest is running. So local_irq_disable has an interesting side effect. Because the interrupt will not be processed until the guest is entered, local_irq_disable effectively switches the IRQ handler from the dummy handler to the processor's posted interrupt handling. So you want to do that before setting IN_GUEST_MODE, otherwise the IPI sent by deliver_posted_interrupt is ignored. However, the patch is wrong, because this bit: if (kvm_lapic_enabled(vcpu)) { /* * Update architecture specific hints for APIC * virtual interrupt delivery. */ if (vcpu->arch.apicv_active) kvm_x86_ops->hwapic_irr_update(vcpu, kvm_lapic_find_highest_irr(vcpu)); } also has to be moved after setting IN_GUEST_MODE. Basically the order for interrupt injection is: (1) set PIR smp_wmb() (2) set ON smp_mb() (3) read vcpu->mode if IN_GUEST_MODE (4a) send posted interrupt IPI else (4b) kick (i.e. cmpxchg vcpu->mode from IN_GUEST_MODE to EXITING_GUEST_MODE and send reschedule IPI) while the order for entering the guest must be the opposite. The numbers on the left identify the pairing between interrupt injection and vcpu_entr_guest (4a) enable posted interrupt processing (i.e. disable interrupts!) (3) set vcpu->mode to IN_GUEST_MODE smp_mb() (2) read ON if ON then (1) read PIR sync PIR to IRR (4b) read vcpu->mode if vcpu->mode == EXITING_GUEST_MODE then cancel vmentry (3/2/1) # posted interrupts are processed on the next vmentry Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html