On 13/08/2014 21:16, Wei Wang wrote:
> From: Yang Zhang <yang.z.zhang@xxxxxxxxx>
>
> Guest may mask the IOAPIC entry before issue EOI. In such case,
> EOI will not be intercepted by hypervisor due to the corrensponding
> bit in eoi exit bitmap is not setting.
>
> The solution is to check ISR + TMR to construct the EOI exit bitmap.
>
> This patch is a better fixing for the issue that commit "0f6c0a740b"
> tries to solve.
>
> Tested-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> Signed-off-by: Wei Wang <wei.w.wang@xxxxxxxxx>
> ---
>  arch/x86/kvm/lapic.c | 17 +++++++++++++++++
>  arch/x86/kvm/lapic.h |  2 ++
>  arch/x86/kvm/x86.c   |  9 +++++++++
>  virt/kvm/ioapic.c    |  7 ++++---
>  4 files changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 08e8a89..0ed4bcb 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -515,6 +515,23 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
>  	__clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention);
>  }
>
> +void kvm_apic_zap_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
> +				u32 *tmr)
> +{
> +	u32 i, reg_off, intr_in_service;
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +
> +	for (i = 0; i < 8; i++) {
> +		reg_off = 0x10 * i;
> +		intr_in_service = apic_read_reg(apic, APIC_ISR + reg_off) &
> +			kvm_apic_get_reg(apic, APIC_TMR + reg_off);
> +		if (intr_in_service) {
> +			*((u32 *)eoi_exit_bitmap + i) |= intr_in_service;
> +			tmr[i] |= intr_in_service;
> +		}
> +	}
> +}
> +
>  void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 6a11845..4ee3d70 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -53,6 +53,8 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
>  u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
>  void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>
> +void kvm_apic_zap_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
> +				u32 *tmr);
>  void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
>  void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 204422d..755b556 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6005,6 +6005,15 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  	memset(tmr, 0, 32);
>
>  	kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap, tmr);
> +	/*
> +	 * Guest may mask the IOAPIC entry before issue EOI. In such case,
> +	 * EOI will not be intercepted by hypervisor due to the corrensponding
> +	 * bit in eoi exit bitmap is not setting.
> +	 *
> +	 * The solution is to check ISR + TMR to construct the EOI exit bitmap.
> +	 */
> +	kvm_apic_zap_eoi_exitmap(vcpu, eoi_exit_bitmap, tmr);
> +
>  	kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
>  	kvm_apic_update_tmr(vcpu, tmr);
>  }
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index e8ce34c..2458a1d 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -254,9 +254,10 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
>  	spin_lock(&ioapic->lock);
>  	for (index = 0; index < IOAPIC_NUM_PINS; index++) {
>  		e = &ioapic->redirtbl[index];
> -		if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
> -		    kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index) ||
> -		    index == RTC_GSI) {
> +		if (!e->fields.mask &&
> +		    (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
> +		     kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
> +			index) || index == RTC_GSI)) {
>  			if (kvm_apic_match_dest(vcpu, NULL, 0,
>  				e->fields.dest_id, e->fields.dest_mode)) {
>  				__set_bit(e->fields.vector,
>

Understanding the patch is complicated because of the subtle difference
between tmr[] and apic_get_reg(..., APIC_TMR). I'd rather avoid that by
first cleaning up the handling of TMR.

Please split the patch in two:

1) one patch should move kvm_apic_update_tmr before
kvm_x86_ops->load_eoi_exitmap, and it should set TMR to

    (~(IRR | ISR) & new_TMR) | ((IRR | ISR) & old_TMR)

I'm using IRR|ISR here because the SDM says the TMR is only modified
upon "acceptance of an interrupt into the IRR". We deviate from the
spec by setting a value for the TMR even when the corresponding bit in
IRR|ISR is 0; that's mostly invisible to guests, so it doesn't matter,
but still the TMR should not change between acceptance of an interrupt
into the IRR and the corresponding EOI cycle.

2) the second patch then can introduce the new logic to add ISR & TMR
to the EOI exit map, and add back the !e->fields.mask check to ioapic.c,
and make lapic.c OR the ISR & TMR value into the EOI exit map. This
second patch need not handle the tmr[] array at all.

Paolo
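
For illustration, here is a minimal sketch of the two steps above,
assuming the 256-bit APIC registers are held as eight 32-bit words.
merge_tmr() and add_in_service_level_irqs() are hypothetical names made
up for this example, not existing KVM functions, and real code would go
through the lapic register accessors rather than plain arrays.

```c
#include <stdint.h>

#define APIC_REG_WORDS 8	/* 256 vectors / 32 bits per word */

/*
 * Hypothetical helper (not a kernel function): step 1.
 * For every vector that is pending (IRR) or in service (ISR), keep the
 * old TMR bit; for all other vectors take the newly computed TMR bit.
 */
static void merge_tmr(uint32_t *tmr_out, const uint32_t *old_tmr,
		      const uint32_t *new_tmr, const uint32_t *irr,
		      const uint32_t *isr)
{
	for (int i = 0; i < APIC_REG_WORDS; i++) {
		uint32_t busy = irr[i] | isr[i];

		tmr_out[i] = (~busy & new_tmr[i]) | (busy & old_tmr[i]);
	}
}

/*
 * Hypothetical helper (not a kernel function): step 2.
 * OR the in-service, level-triggered vectors (ISR & TMR) into the EOI
 * exit map. The exit map is modelled here as eight 32-bit words; the
 * tmr[] scan array is not touched.
 */
static void add_in_service_level_irqs(uint32_t *eoi_exit,
				      const uint32_t *isr,
				      const uint32_t *tmr)
{
	for (int i = 0; i < APIC_REG_WORDS; i++)
		eoi_exit[i] |= isr[i] & tmr[i];
}
```

With old_TMR = 0x5, new_TMR = 0xA, ISR = 0x1 and IRR = 0x2 in one word,
the merge keeps bits 0 and 1 from the old value and takes bits 2 and 3
from the new one, giving 0x9.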