On Wed, 24 May 2017 22:13:18 +0200 Eric Auger <eric.auger@xxxxxxxxxx> wrote: > We add two new fields in vfio_pci_irq_ctx struct: deoi and handler. > If deoi is set, this means the physical IRQ attached to the virtual > IRQ is directly deactivated by the guest and the VFIO driver does > not need to disable the physical IRQ and mask it at VFIO level. > > The handler pointer is set accordingly and a wrapper handler is > introduced that calls the chosen handler function. > > Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx> > > --- > --- > drivers/vfio/pci/vfio_pci_intrs.c | 32 ++++++++++++++++++++++++++------ > drivers/vfio/pci/vfio_pci_private.h | 2 ++ > 2 files changed, 28 insertions(+), 6 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c > index d4d377b..06aa713 100644 > --- a/drivers/vfio/pci/vfio_pci_intrs.c > +++ b/drivers/vfio/pci/vfio_pci_intrs.c > @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev) > static irqreturn_t vfio_intx_handler(int irq, void *dev_id) > { > struct vfio_pci_device *vdev = dev_id; > - unsigned long flags; > int ret = IRQ_NONE; > > - spin_lock_irqsave(&vdev->irqlock, flags); > - > if (!vdev->pci_2_3) { > disable_irq_nosync(vdev->pdev->irq); > vdev->ctx[0].automasked = true; > @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) > ret = IRQ_HANDLED; > } > > - spin_unlock_irqrestore(&vdev->irqlock, flags); > - > if (ret == IRQ_HANDLED) > vfio_send_intx_eventfd(vdev, NULL); > > return ret; > } > > +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id) > +{ > + struct vfio_pci_device *vdev = dev_id; > + > + vfio_send_intx_eventfd(vdev, NULL); > + return IRQ_HANDLED; > +} > + > +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id) > +{ > + struct vfio_pci_device *vdev = dev_id; > + unsigned long flags; > + irqreturn_t ret; > + > + spin_lock_irqsave(&vdev->irqlock, flags); > + ret = vdev->ctx[0].handler(irq, dev_id); > + spin_unlock_irqrestore(&vdev->irqlock, flags); > + > + return ret; > +} > + > static int vfio_intx_enable(struct vfio_pci_device *vdev) > { > if (!is_irq_none(vdev)) > @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd) > if (!vdev->pci_2_3) > irqflags = 0; > > - ret = request_irq(pdev->irq, vfio_intx_handler, > + if (vdev->ctx[0].deoi) > + vdev->ctx[0].handler = vfio_intx_handler_deoi; > + else > + vdev->ctx[0].handler = vfio_intx_handler; > + ret = request_irq(pdev->irq, vfio_intx_wrapper_handler, > irqflags, vdev->ctx[0].name, vdev); Here's where I think we don't account for irqflags properly. If we get a shared interrupt here, then enabling direct EOI needs to be disabled or else we'll starve other devices sharing the interrupt. In practice, I wonder if this makes PCI direct EOI a useful feature. We could try to get an exclusive interrupt and fallback to shared, but any time we get an exclusive interrupt we're more prone to conflicts with other devices. I might have two VMs that share an interrupt and now it's a race that only the first to setup an IRQ can work. Worse, one of those VMs might be fully booted and switched to MSI and now it's just a matter of time until they reboot in the right way to generate a conflict. I might also have two devices in the same VM that share an IRQ and now I can't start the VM at all because the second device can no longer get an interrupt. This is the same problem we have with the nointxmask flag, it's a useful debugging feature but since the masking is done at the APIC/GIC rather than the device, much like here, it's not very practical for more than debugging and isolating specific devices as requiring APIC/GIC level masking. I'm not sure how to proceed on the PCI side here. Thanks, Alex > if (ret) { > vdev->ctx[0].trigger = NULL; > diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h > index f7f1101..5cfe59a 100644 > --- a/drivers/vfio/pci/vfio_pci_private.h > +++ b/drivers/vfio/pci/vfio_pci_private.h > @@ -36,6 +36,8 @@ struct vfio_pci_irq_ctx { > char *name; > bool masked; > bool automasked; > + bool deoi; > + irqreturn_t (*handler)(int irq, void *dev_id); > struct irq_bypass_producer producer; > }; >