> From: Chatre, Reinette <reinette.chatre@xxxxxxxxx>
> Sent: Friday, April 28, 2023 1:36 AM
>
> pci_msix_alloc_irq_at() enables an individual MSI-X interrupt to be
> allocated after MSI-X enabling.
>
> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
> the time a valid eventfd is assigned. This is different behavior from
> a range provided during MSI-X enabling where interrupts are allocated
> for the entire range whether a valid eventfd is provided for each
> interrupt or not.
>
> The PCI-MSIX API requires that some number of irqs are allocated for
> an initial set of vectors when enabling MSI-X on the device. When
> dynamic MSIX allocation is not supported, the vector table, and thus
> the allocated irq set can only be resized by disabling and re-enabling
> MSI-X with a different range. In that case the irq allocation is
> essentially a cache for configuring vectors within the previously
> allocated vector range. When dynamic MSI-X allocation is supported,
> the API still requires some initial set of irqs to be allocated, but
> also supports allocating and freeing specific irq vectors both
> within and beyond the initially allocated range.
>
> For consistency between modes, as well as to reduce latency and improve
> reliability of allocations, and also simplicity, this implementation
> only releases irqs via pci_free_irq_vectors() when either the interrupt
> mode changes or the device is released.

It improves the reliability of allocations from the calling device's
p.o.v.

But system-wide this is not an efficient use of irqs, and not releasing
them in a timely manner may affect the reliability of allocations for
other devices.

Should this behavior be configurable?

> +/*
> + * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
> + * If a Linux IRQ number is not available then a new interrupt will be
> + * allocated if dynamic MSI-X is supported.
> + */
> +static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
> +			      unsigned int vector, bool msix)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	struct msi_map map;
> +	int irq;
> +	u16 cmd;
> +
> +	irq = pci_irq_vector(pdev, vector);
> +	if (irq > 0 || !msix || !vdev->has_dyn_msix)
> +		return irq;

if (irq >= 0 || ...)

> +
> +/*
> + * Where is vfio_msi_free_irq() ?
> + *
> + * Allocated interrupts are maintained, essentially forming a cache that
> + * subsequent allocations can draw from. Interrupts are freed using
> + * pci_free_irq_vectors() when MSI/MSI-X is disabled.
> + */

Probably merge it with the comment of vfio_msi_alloc_irq()?

> @@ -401,6 +430,12 @@ static int vfio_msi_set_vector_signal(struct
> vfio_pci_core_device *vdev,
> 	if (fd < 0)
> 		return 0;
>
> +	if (irq == -EINVAL) {
> +		irq = vfio_msi_alloc_irq(vdev, vector, msix);
> +		if (irq < 0)
> +			return irq;
> +	}
> +
> 	ctx = vfio_irq_ctx_alloc(vdev, vector);
> 	if (!ctx)
> 		return -ENOMEM;

This doesn't read cleanly: an irq is allocated here but not released in
the error unwind.
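
To make the unwind concern concrete, below is a minimal userspace sketch
of the control flow, not kernel code and not tested against the driver.
Everything named model_* (and struct model_dev) is a hypothetical
stand-in invented for illustration: model_irq_vector() loosely plays the
role of pci_irq_vector(), and model_alloc_irq() the role of
vfio_msi_alloc_irq(). The sketch assumes that only an irq allocated by
the failing call would be released, so irqs already sitting in the cache
are left alone. Whether the real fix would use pci_msix_free_irq() for
that step is an assumption on my side.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_NVEC 4

/* Hypothetical stand-in for per-device state; not the real vfio structures. */
struct model_dev {
	int irq[MODEL_NVEC];		/* 0 means no irq allocated for the vector */
	bool has_ctx[MODEL_NVEC];	/* a context exists for the vector */
	bool ctx_alloc_fails;		/* knob: force the context allocation to fail */
	int next_irq;			/* next irq number the model hands out */
};

void model_init(struct model_dev *d)
{
	memset(d, 0, sizeof(*d));
	d->next_irq = 32;		/* arbitrary starting irq number */
}

/* Return the irq backing a vector, or -EINVAL if none is allocated. */
int model_irq_vector(const struct model_dev *d, unsigned int vector)
{
	if (vector >= MODEL_NVEC || d->irq[vector] == 0)
		return -EINVAL;
	return d->irq[vector];
}

/* Allocate an irq for a vector unless one is already cached. */
int model_alloc_irq(struct model_dev *d, unsigned int vector)
{
	if (vector >= MODEL_NVEC)
		return -EINVAL;
	if (d->irq[vector] == 0)
		d->irq[vector] = d->next_irq++;
	return d->irq[vector];
}

/* Release the irq backing a vector. */
void model_free_irq(struct model_dev *d, unsigned int vector)
{
	assert(vector < MODEL_NVEC);
	d->irq[vector] = 0;
}

/* Set up a vector; on failure, undo only what this call allocated. */
int model_set_vector_signal(struct model_dev *d, unsigned int vector)
{
	bool allocated_here = false;
	int irq = model_irq_vector(d, vector);

	if (irq == -EINVAL) {
		irq = model_alloc_irq(d, vector);
		if (irq < 0)
			return irq;
		allocated_here = true;
	}

	if (d->ctx_alloc_fails) {
		/* Error unwind: drop the irq only if it was allocated above. */
		if (allocated_here)
			model_free_irq(d, vector);
		return -ENOMEM;
	}

	d->has_ctx[vector] = true;
	return 0;
}
```

The allocated_here flag is the only addition relative to the quoted hunk.
An alternative that keeps the "cache" semantics would be to leave the
code as is and state in the comment that a freshly allocated irq is
intentionally kept on failure.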