Re: [PATCH v2] KVM: x86: keep eoi exit bitmap accurate before loading it.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Il 27/08/2014 16:05, Wei Wang ha scritto:
> Guest may mask the IOAPIC entry before issue EOI. In such case,
> EOI will not be intercepted by the hypervisor, since the corresponding
> bit in eoi_exit_bitmap is not set after the masking of IOAPIC entry.
> 
> The solution here is to OR eoi_exit_bitmap with tmr to make sure that
> all level-triggered interrupts have their bits in eoi_exit_bitmap set.

This commit message does not explain why this change is necessary, and
the relationship between this patch and the previous one.

For example:

------
Commit 0f6c0a740b7d (KVM: x86: always exit on EOIs for interrupts listed
in the IOAPIC redir table, 2014-07-30) fixed an APICv bug where an
incorrect EOI exit bitmap triggered an interrupt storm inside the guest.

There is a corner case for which that patch would have disabled
accelerated EOI unnecessarily.  Suppose you have:

- a device that was the sole user of an INTx interrupt and is
hot-unplugged

- an OS that masks the INTx interrupt entry in the IOAPIC after the
unplug

- another device that uses MSI and is subsequently hot-plugged

If the OS chooses to reuse the same LAPIC interrupt vector for the two
interrupts, the patch would have left the vector in the EOI exit
bitmap, because KVM takes into account the stale entry in the IOAPIC
redirection table.

We do know exactly which masked interrupts are still in-service and
thus require broadcasting an EOI to the IOAPIC: this information is in
the TMR.  So, this patch ORs the EOI exit bitmap provided by the ioapic
with the TMR register.  Thanks to the previous patch, an active
level-triggered interrupt will always be included in the EOI exit
bitmap.
------

However, see below.

> Tested-by: Rongrong Liu <rongrongx.liu@xxxxxxxxx>
> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> Signed-off-by: Wei Wang <wei.w.wang@xxxxxxxxx>
> ---
>  arch/x86/kvm/lapic.c |   12 ++++++++++++
>  arch/x86/kvm/lapic.h |    1 +
>  arch/x86/kvm/x86.c   |    1 +
>  virt/kvm/ioapic.c    |    7 ++++---
>  4 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8c1162d..0fcac3c 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -539,6 +539,18 @@ void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr)
>  	}
>  }
>  
> +void kvm_apic_update_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
> +{
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +	u32 i;
> +	u32 tmr;
> +
> +	for (i = 0; i < 8; i++) {
> +		tmr = kvm_apic_get_reg(apic, APIC_TMR + 0x10 * i);
> +		*((u32 *)eoi_exit_bitmap + i) |= tmr;
> +	}
> +}
> +
>  static void apic_update_ppr(struct kvm_lapic *apic)
>  {
>  	u32 tpr, isrv, ppr, old_ppr;
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 6a11845..d2b96f2 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -55,6 +55,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>  
>  void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
>  void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
> +void kvm_apic_update_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
>  int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
>  int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d401684..d23b558 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5992,6 +5992,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  
>  	kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap, tmr);
>  	kvm_apic_update_tmr(vcpu, tmr);
> +	kvm_apic_update_eoi_exitmap(vcpu, eoi_exit_bitmap);
>  	kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
>  }
>  
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index e8ce34c..ed15936 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -254,9 +254,10 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
>  	spin_lock(&ioapic->lock);
>  	for (index = 0; index < IOAPIC_NUM_PINS; index++) {
>  		e = &ioapic->redirtbl[index];
> -		if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
> -		    kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index) ||
> -		    index == RTC_GSI) {
> +		if ((!e->fields.mask
> +			&& e->fields.trig_mode == IOAPIC_LEVEL_TRIG)
> +			|| kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
> +				index) || index == RTC_GSI) {
>  			if (kvm_apic_match_dest(vcpu, NULL, 0,
>  				e->fields.dest_id, e->fields.dest_mode)) {
>  				__set_bit(e->fields.vector,
> 

There's still something missing here.

Suppose you have the following:

    Program edge-triggered MSI for vector 123
    Interrupt comes in, ISR[123]=1
    Mask MSI
    Program level-triggered IOAPIC interrupt for vector 123
         >> Here, TMR[123] remains 0.
    Send EOI for vector 123

Now the TMR will not be updated in the LAPICs, and the EOI exit bitmap
will not be set correctly.

To fix this, we could drop all communication of the TMR from IOAPIC to
LAPIC.  Instead, do what the processor does and just modify the TMR in
__apic_accept_irq, based on the trig_mode.  If the TMR changes, you
trigger an IOAPIC scan that will also refresh the EOI exit bitmap.

For the hot-unplug/hot-plug case that Yang mentioned, the EOI exit
bitmap will be updated as soon as the first MSI arrives (the MSI is
edge-triggered).

But this is more tricky than it looks like.  Because the interrupt must
not be delivered until the EOI exit bitmap is right, you have to skip
posted interrupt delivery.  There could be races between the handling of
KVM_REQ_SCAN_IOAPIC and KVM_REQ_EVENT, which are hard to get right.  I'm
not sure it's worthwhile for what is after all a corner case.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux