RE: [PATCH V4 10/11] vfio/pci: Support dynamic MSI-X

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Fri, 28 Apr 2023 06:50:01 +0000

> From: Chatre, Reinette <reinette.chatre@xxxxxxxxx>
> Sent: Friday, April 28, 2023 1:36 AM
> 
> pci_msix_alloc_irq_at() enables an individual MSI-X interrupt to be
> allocated after MSI-X enabling.
> 
> Use dynamic MSI-X (if supported by the device) to allocate an interrupt
> after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
> the time a valid eventfd is assigned. This is different behavior from
> a range provided during MSI-X enabling where interrupts are allocated
> for the entire range whether a valid eventfd is provided for each
> interrupt or not.
> 
> The PCI-MSIX API requires that some number of irqs are allocated for
> an initial set of vectors when enabling MSI-X on the device. When
> dynamic MSIX allocation is not supported, the vector table, and thus
> the allocated irq set can only be resized by disabling and re-enabling
> MSI-X with a different range. In that case the irq allocation is
> essentially a cache for configuring vectors within the previously
> allocated vector range. When dynamic MSI-X allocation is supported,
> the API still requires some initial set of irqs to be allocated, but
> also supports allocating and freeing specific irq vectors both
> within and beyond the initially allocated range.
> 
> For consistency between modes, as well as to reduce latency and improve
> reliability of allocations, and also simplicity, this implementation
> only releases irqs via pci_free_irq_vectors() when either the interrupt
> mode changes or the device is released.

From the calling device's point of view, this does improve the
reliability of allocations.

System-wide, however, it is not an efficient use of irqs, and not
releasing them in a timely manner may affect the reliability of
allocations for other devices.

Should this behavior be something configurable?

> 
> +/*
> + * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
> + * If a Linux IRQ number is not available then a new interrupt will be
> + * allocated if dynamic MSI-X is supported.
> + */
> +static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
> +			      unsigned int vector, bool msix)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	struct msi_map map;
> +	int irq;
> +	u16 cmd;
> +
> +	irq = pci_irq_vector(pdev, vector);
> +	if (irq > 0 || !msix || !vdev->has_dyn_msix)
> +		return irq;

if (irq >= 0 || ...)

> +
> +/*
> + * Where is vfio_msi_free_irq() ?
> + *
> + * Allocated interrupts are maintained, essentially forming a cache that
> + * subsequent allocations can draw from. Interrupts are freed using
> + * pci_free_irq_vectors() when MSI/MSI-X is disabled.
> + */

Could this be merged into the comment above vfio_msi_alloc_irq()?

> @@ -401,6 +430,12 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
>  	if (fd < 0)
>  		return 0;
> 
> +	if (irq == -EINVAL) {
> +		irq = vfio_msi_alloc_irq(vdev, vector, msix);
> +		if (irq < 0)
> +			return irq;
> +	}
> +
>  	ctx = vfio_irq_ctx_alloc(vdev, vector);
>  	if (!ctx)
>  		return -ENOMEM;

It doesn't read cleanly that an irq is allocated here but is not
released in the error unwind.
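
To make the concern concrete, here is a minimal user-space sketch of an
unwind that releases only the interrupt allocated by the current call.
It is not the vfio code: struct mock_dev and every mock_* helper are
hypothetical stand-ins, and the forced failure stands in for
vfio_irq_ctx_alloc() returning NULL. The patch deliberately keeps
allocated interrupts as a cache, so this shows an alternative for
comparison, not a required change.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_VECTORS 8

/* Hypothetical model of per-device state; not the real vfio structures. */
struct mock_dev {
	int irq[MAX_VECTORS];	/* 0 = not allocated, else a fake IRQ number */
	int next_irq;		/* next fake IRQ number to hand out */
	bool fail_ctx_alloc;	/* force the later context allocation to fail */
};

static void mock_dev_init(struct mock_dev *d)
{
	for (int i = 0; i < MAX_VECTORS; i++)
		d->irq[i] = 0;
	d->next_irq = 100;
	d->fail_ctx_alloc = false;
}

/* Stand-in for a dynamic allocation: reuse a cached IRQ or hand out a new one. */
static int mock_alloc_irq(struct mock_dev *d, unsigned int vector)
{
	assert(vector < MAX_VECTORS);
	if (d->irq[vector] > 0)
		return d->irq[vector];
	d->irq[vector] = d->next_irq++;
	return d->irq[vector];
}

/* Stand-in for releasing a single dynamically allocated interrupt. */
static void mock_free_irq(struct mock_dev *d, unsigned int vector)
{
	assert(vector < MAX_VECTORS);
	d->irq[vector] = 0;
}

/*
 * Model of the quoted hunk with a symmetric unwind: if a later step fails,
 * free the interrupt, but only when this call was the one that allocated it.
 */
static int mock_set_vector_signal(struct mock_dev *d, unsigned int vector)
{
	bool allocated_here = false;
	int irq;

	if (vector >= MAX_VECTORS)
		return -EINVAL;

	irq = d->irq[vector] > 0 ? d->irq[vector] : -EINVAL;
	if (irq == -EINVAL) {
		irq = mock_alloc_irq(d, vector);
		if (irq < 0)
			return irq;
		allocated_here = true;
	}

	if (d->fail_ctx_alloc) {	/* models the context allocation failing */
		if (allocated_here)
			mock_free_irq(d, vector);
		return -ENOMEM;
	}

	return 0;
}
```

Tracking allocated_here keeps the unwind from freeing an interrupt that
an earlier, successful call had already set up for the same vector.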