Re: [PATCH 1/3] KVM: Emulation MSI-X mask bits for assigned devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 12, 2010 at 02:49:58PM +0800, Sheng Yang wrote:
> On Monday 11 October 2010 18:01:00 Michael S. Tsirkin wrote:
> > On Mon, Oct 11, 2010 at 05:28:30PM +0800, Sheng Yang wrote:
> > > On Sunday 03 October 2010 19:12:47 Michael S. Tsirkin wrote:
> > > > On Tue, Sep 28, 2010 at 05:44:10PM +0800, Sheng Yang wrote:
> > > > > This patch enable per-vector mask for assigned devices using MSI-X.
> > > > > 
> > > > > Signed-off-by: Sheng Yang <sheng@xxxxxxxxxxxxxxx>
> > > > 
> > > > I think I see an issue here, noted below.  Some general comments:
> > > > - The mask bit seems broken for injecting interrupts from
> > > > 
> > > >   userspace (with interrupts/ioctls).
> > > >   I think we must test it on injection path.
> > > 
> > > I am not quite understand how it related to userspace interrupt injection
> > > here... This patch only cover assigned devices for now.
> > 
> > Well, this is a kernel/userspace interface, if it's broken for userspace
> > injection now we'll have to go through pain to fix it in a compatible
> > way later when we want to use it for userspace injection.
> > You might want to ask why we want the kernel to support making
> > userspace-injected interrupts when userspace can just avoid injecting
> > them, and the answer would be that with irqfd the injection might be
> > handled in a separate process.
> 
> OK, I've understood how it related to userspace interrupt injection. But I still 
> can't see why the interface is broken...
> > 
> > We currently handle this by destroying irqfd when irq is masked,
> > an ioctl instead would be much faster.
> > 
> > > > - We'll need a way to support the pending bit.
> > > > 
> > > >   Disabling interrupts will not let us do it.
> > > 
> > > We may need a way to support pending bit, though I don't know which guest
> > > has used it... And we can still know if there is interrupt pending by
> > > check the real hardware's pending bit if it's necessary.
> > 
> > That's what I'm saying: since instead of masking the vector in hardware
> > you disable irq in the APIC, the pending bit that we read from hardware
> > will not have the correct value.
> 
> Are you sure? This disable_irq() has nothing to do with APIC. The disable callback 
> in msi_chip didn't do anything but mark the IRQ status as IRQ_DISABLED, and the 
> follow interrupt(if there are any) would be acked and masked, using mask callback 
> in msi_chip. 
>
> irq_to_desc() need to be exported for my initial version, in order to use mask 
> callback. But later I think it would be clear and better if we use general IRQ 
> function to do it. And I don't think the current solution would prevent us from 
> reading hardware pending bits.

Not sure, I'll try to look into code later,  but just based on this
description:

on real hardware:
	mask
	interrupt
results in both pending bit being set

on guest with assigned device
	mask
	interrupt
results in mask being set but pending bit not set
(as interrupt was already sent)

So if we try to look at pending bits looks like we'll miss
some interrupts. No?

> 
> --
> regards
> Yang, Sheng
> 
> > 
> > If we fix this, pending bit handling can be done by userspace.
> > 
> > > (And we haven't seen any problem by
> > > leaving the bit 0 so far, and it's not in this patch's scope.)
> > 
> > I don't know about anyone using this, either, but the PCI spec does
> > require support of polling mode where the pending bit is polled instead
> > of interrupts. So yes, not a high priority to implement, but let's give
> > the way we intend to support this in the future some thought.
> > 
> > > > > ---
> > > > > 
> > > > >  arch/x86/kvm/x86.c       |    1 +
> > > > >  include/linux/kvm.h      |    9 ++++++++-
> > > > >  include/linux/kvm_host.h |    1 +
> > > > >  virt/kvm/assigned-dev.c  |   39
> > > 
> > > +++++++++++++++++++++++++++++++++++++++
> > > 
> > > > >  4 files changed, 49 insertions(+), 1 deletions(-)
> > > > > 
> > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > index 8412c91..e6933e6 100644
> > > > > --- a/arch/x86/kvm/x86.c
> > > > > +++ b/arch/x86/kvm/x86.c
> > > > > @@ -1927,6 +1927,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > 
> > > > >  	case KVM_CAP_DEBUGREGS:
> > > > >  	case KVM_CAP_X86_ROBUST_SINGLESTEP:
> > > > > 
> > > > >  	case KVM_CAP_XSAVE:
> > > > > +	case KVM_CAP_DEVICE_MSIX_MASK:
> > > > >  		r = 1;
> > > > >  		break;
> > > > >  	
> > > > >  	case KVM_CAP_COALESCED_MMIO:
> > > > > diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> > > > > index 919ae53..f2b7cdc 100644
> > > > > --- a/include/linux/kvm.h
> > > > > +++ b/include/linux/kvm.h
> > > > > @@ -540,6 +540,9 @@ struct kvm_ppc_pvinfo {
> > > > > 
> > > > >  #endif
> > > > >  #define KVM_CAP_PPC_GET_PVINFO 57
> > > > >  #define KVM_CAP_PPC_IRQ_LEVEL 58
> > > > > 
> > > > > +#ifdef __KVM_HAVE_MSIX
> > > > > +#define KVM_CAP_DEVICE_MSIX_MASK 59
> > > > > +#endif
> > > > > 
> > > > >  #ifdef KVM_CAP_IRQ_ROUTING
> > > > > 
> > > > > @@ -787,11 +790,15 @@ struct kvm_assigned_msix_nr {
> > > > > 
> > > > >  };
> > > > >  
> > > > >  #define KVM_MAX_MSIX_PER_DEV		256
> > > > > 
> > > > > +
> > > > > +#define KVM_MSIX_FLAG_MASK	1
> > > > > +
> > > > > 
> > > > >  struct kvm_assigned_msix_entry {
> > > > >  
> > > > >  	__u32 assigned_dev_id;
> > > > >  	__u32 gsi;
> > > > >  	__u16 entry; /* The index of entry in the MSI-X table */
> > > > > 
> > > > > -	__u16 padding[3];
> > > > > +	__u16 flags;
> > > > > +	__u16 padding[2];
> > > > > 
> > > > >  };
> > > > >  
> > > > >  #endif /* __LINUX_KVM_H */
> > > > > 
> > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > > index 0b89d00..a43405c 100644
> > > > > --- a/include/linux/kvm_host.h
> > > > > +++ b/include/linux/kvm_host.h
> > > > > @@ -415,6 +415,7 @@ struct kvm_irq_ack_notifier {
> > > > > 
> > > > >  };
> > > > >  
> > > > >  #define KVM_ASSIGNED_MSIX_PENDING		0x1
> > > > > 
> > > > > +#define KVM_ASSIGNED_MSIX_MASK			0x2
> > > > > 
> > > > >  struct kvm_guest_msix_entry {
> > > > >  
> > > > >  	u32 vector;
> > > > >  	u16 entry;
> > > > > 
> > > > > diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> > > > > index 7c98928..15b8c32 100644
> > > > > --- a/virt/kvm/assigned-dev.c
> > > > > +++ b/virt/kvm/assigned-dev.c
> > > > > @@ -17,6 +17,8 @@
> > > > > 
> > > > >  #include <linux/pci.h>
> > > > >  #include <linux/interrupt.h>
> > > > >  #include <linux/slab.h>
> > > > > 
> > > > > +#include <linux/irqnr.h>
> > > > > +
> > > > > 
> > > > >  #include "irq.h"
> > > > >  
> > > > >  static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct
> > > > >  list_head *head,
> > > > > 
> > > > > @@ -666,6 +668,30 @@ msix_nr_out:
> > > > >  	return r;
> > > > >  
> > > > >  }
> > > > > 
> > > > > +static void update_msix_mask(struct kvm_assigned_dev_kernel
> > > > > *assigned_dev, +			     int index)
> > > > > +{
> > > > > +	int irq;
> > > > > +
> > > > > +	if (!assigned_dev->dev->msix_enabled ||
> > > > > +	    !(assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX))
> > > > > +		return;
> > > > > +
> > > > > +	irq = assigned_dev->host_msix_entries[index].vector;
> > > > > +
> > > > > +	ASSERT(irq != 0);
> > > > > +
> > > > > +	if (assigned_dev->guest_msix_entries[index].flags &
> > > > > +			KVM_ASSIGNED_MSIX_MASK)
> > > > > +		disable_irq(irq);
> > > > > +	else {
> > > > > +		enable_irq(irq);
> > > > > +		if (assigned_dev->guest_msix_entries[index].flags &
> > > > > +				KVM_ASSIGNED_MSIX_PENDING)
> > > > > +			schedule_work(&assigned_dev->interrupt_work);
> > > > > +	}
> > > > > +}
> > > > > +
> > > > 
> > > > What happens if guest masks an entry and then we hot-unplug the
> > > > device and remove it from guest? It looks like interrupt
> > > > will stay disabled?
> > > 
> > > I don't think so. pci_disable_msix() which was called in hot-unplug path
> > > would recycle all IRQs used by the device. It should be the same as VM
> > > shutdown.
> > > 
> > > Also before the IRQ was recycled, I believe the same dynamic IRQ wouldn't
> > > be used by other devices.
> > > 
> > > --
> > > regards
> > > Yang, Sheng
> > > 
> > > > >  static int kvm_vm_ioctl_set_msix_entry(struct kvm *kvm,
> > > > >  
> > > > >  				       struct kvm_assigned_msix_entry *entry)
> > > > >  
> > > > >  {
> > > > > 
> > > > > @@ -688,6 +714,19 @@ static int kvm_vm_ioctl_set_msix_entry(struct
> > > > > kvm *kvm,
> > > > > 
> > > > >  			adev->guest_msix_entries[i].entry = entry->entry;
> > > > >  			adev->guest_msix_entries[i].vector = entry->gsi;
> > > > >  			adev->host_msix_entries[i].entry = entry->entry;
> > > > > 
> > > > > +			if ((entry->flags & KVM_MSIX_FLAG_MASK) &&
> > > > > +					!(adev->guest_msix_entries[i].flags &
> > > > > +					KVM_ASSIGNED_MSIX_MASK)) {
> > > > > +				adev->guest_msix_entries[i].flags |=
> > > > > +					KVM_ASSIGNED_MSIX_MASK;
> > > > > +				update_msix_mask(adev, i);
> > > > > +			} else if (!(entry->flags & KVM_MSIX_FLAG_MASK) &&
> > > > > +					(adev->guest_msix_entries[i].flags &
> > > > > +					KVM_ASSIGNED_MSIX_MASK)) {
> > > > > +				adev->guest_msix_entries[i].flags &=
> > > > > +					~KVM_ASSIGNED_MSIX_MASK;
> > > > > +				update_msix_mask(adev, i);
> > > > > +			}
> > > > > 
> > > > >  			break;
> > > > >  		
> > > > >  		}
> > > > >  	
> > > > >  	if (i == adev->entries_nr) {
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux