2014-11-16 23:49+0200, Nadav Amit:
> apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR is
> set.  If this assumption is broken and apicv is disabled, the injection of
> interrupts may be deferred until another interrupt is delivered to the guest.
> Ultimately, if no other interrupt should be injected to that vCPU, the pending
> interrupt may be lost.
>
> commit 56cc2406d68c ("KVM: nVMX: fix "acknowledge interrupt on exit" when APICv
> is in use") changed the behavior of apic_clear_irr so irr_pending is cleared
> after setting APIC_IRR vector. After this commit, if apic_set_irr and
> apic_clear_irr run simultaneously, a race may occur, resulting in APIC_IRR
> vector set, and irr_pending cleared. In the following example, assume a single
> vector is set in IRR prior to calling apic_clear_irr:
>
> apic_set_irr				apic_clear_irr
> ------------				--------------
> apic->irr_pending = true;
> 					apic_clear_vector(...);
> 					vec = apic_search_irr(apic);
> 					// => vec == -1
> apic_set_vector(...);
> 					apic->irr_pending = (vec != -1);
> 					// => apic->irr_pending == false
>
> Nonetheless, it appears the race might even occur prior to this commit:
>
> apic_set_irr				apic_clear_irr
> ------------				--------------
> apic->irr_pending = true;
> 					apic->irr_pending = false;
> 					apic_clear_vector(...);
> 					if (apic_search_irr(apic) != -1)
> 						apic->irr_pending = true;
> 					// => apic->irr_pending == false
> apic_set_vector(...);
>
> Fixing this issue by:
> 1. Restoring the previous behavior of apic_clear_irr: clear irr_pending, call
>    apic_clear_vector, and then if APIC_IRR is non-zero, set irr_pending.
> 2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
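
(Not part of the patch: a minimal user-space sketch that replays the two
interleavings described above, plus the patched ordering, by brute force.
struct toy_apic, toy_search_irr, set_step, clear_step and count_lost are
made-up names, a single 32-bit word stands in for APIC_IRR, and the vector
numbers are arbitrary. It assumes every statement is one indivisible step and
ignores compiler and CPU reordering, so it only illustrates the ordering
argument and proves nothing about the real code.)

```c
#include <assert.h>
#include <stdbool.h>

#define OLD_VEC 3	/* vector being cleared (arbitrary choice) */
#define NEW_VEC 5	/* vector being set concurrently (arbitrary choice) */

/* Hypothetical stand-in for the relevant bits of struct kvm_lapic. */
struct toy_apic {
	unsigned int irr;	/* one word instead of the APIC_IRR registers */
	bool irr_pending;
	int vec;		/* local variable of the clearing side */
};

/* Highest set vector, or -1 if IRR is empty. */
static int toy_search_irr(const struct toy_apic *a)
{
	for (int v = 31; v >= 0; v--)
		if (a->irr & (1u << v))
			return v;
	return -1;
}

/* Step k (0 or 1) of the setting side; patched order is bit first, flag second. */
static void set_step(struct toy_apic *a, bool patched, int k)
{
	bool do_bit = patched ? (k == 0) : (k == 1);

	if (do_bit)
		a->irr |= 1u << NEW_VEC;
	else
		a->irr_pending = true;
}

/* Step k of the clearing side (non-APICv path only). */
static void clear_step(struct toy_apic *a, bool restored, int k)
{
	if (!restored) {	/* ordering after commit 56cc2406d68c */
		switch (k) {
		case 0: a->irr &= ~(1u << OLD_VEC); break;
		case 1: a->vec = toy_search_irr(a); break;
		case 2: a->irr_pending = (a->vec != -1); break;
		}
	} else {		/* clear flag, clear bit, re-check */
		switch (k) {
		case 0: a->irr_pending = false; break;
		case 1: a->irr &= ~(1u << OLD_VEC); break;
		case 2: a->vec = toy_search_irr(a); break;
		case 3: if (a->vec != -1) a->irr_pending = true; break;
		}
	}
}

/*
 * Try every interleaving of one set and one clear and count those that end
 * with a vector in IRR while irr_pending is false.
 */
static int count_lost(bool set_patched, bool clear_restored)
{
	int n_clear = clear_restored ? 4 : 3;
	int total = n_clear + 2;
	int lost = 0;

	for (int i = 0; i < total; i++) {
		for (int j = i + 1; j < total; j++) {
			struct toy_apic a = {
				.irr = 1u << OLD_VEC, .irr_pending = true, .vec = -1,
			};
			int c = 0;

			/* slots i and j run the set steps, the rest run clear */
			for (int slot = 0; slot < total; slot++) {
				if (slot == i)
					set_step(&a, set_patched, 0);
				else if (slot == j)
					set_step(&a, set_patched, 1);
				else
					clear_step(&a, clear_restored, c++);
			}
			assert(c == n_clear);
			if (a.irr != 0 && !a.irr_pending)
				lost++;
		}
	}
	return lost;
}
```

Under those assumptions, count_lost(false, false) models the first example,
count_lost(false, true) the second, and count_lost(true, true) the ordering
after this patch. Called from a small main(), they come out as 6 of 10, 2 of
15 and 0 of 15 interleavings ending with a vector in IRR and irr_pending
false. Only the final state is checked; the patched ordering still has a
short window between apic_set_vector and the flag store.
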
>
> Signed-off-by: Nadav Amit <namit@xxxxxxxxxxxxxxxxx>
> ---
>  arch/x86/kvm/lapic.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6e8ce5a..e0e5642 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -341,8 +341,12 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>  
>  static inline void apic_set_irr(int vec, struct kvm_lapic *apic)
>  {
> -	apic->irr_pending = true;
>  	apic_set_vector(vec, apic->regs + APIC_IRR);
> +	/*
> +	 * irr_pending must be true if any interrupt is pending; set it after
> +	 * APIC_IRR to avoid race with apic_clear_irr
> +	 */
> +	apic->irr_pending = true;

(A race that ends up with 'irr_pending = true' and zero IRR is harmless.)

>  }
>  
>  static inline int apic_search_irr(struct kvm_lapic *apic)
> @@ -374,13 +378,15 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>  
>  	vcpu = apic->vcpu;
>  
> -	apic_clear_vector(vec, apic->regs + APIC_IRR);
> -	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
> +	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) {
>  		/* try to update RVI */
> +		apic_clear_vector(vec, apic->regs + APIC_IRR);
>  		kvm_make_request(KVM_REQ_EVENT, vcpu);
> -	else {
> -		vec = apic_search_irr(apic);
> -		apic->irr_pending = (vec != -1);
> +	} else {
> +		apic->irr_pending = false;
> +		apic_clear_vector(vec, apic->regs + APIC_IRR);
> +		if (apic_search_irr(apic) != -1)
> +			apic->irr_pending = true;
>  	}

Works because apic_clear_vector() is also a compiler barrier ...

Reviewed-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>

(I hope the performance gain of irr_pending is worth its complexity.)