From: Xinghui Li <korantli@xxxxxxxxxxx> In the past, the vmd MSI mode can only be adjusted by configuring vmd_ids table. This patch adds another way to adjust MSI mode by adjusting module parameter, which allows users easier to adjust the vmd according to the I/O scenario without rebuilding driver. - "disable_msi_bypass=0 or other values": Under normal circumstances, we recommend enable the VMD MSI-X bypass feature, which improves interrupt handling performance by avoiding the VMD MSI-X domain interrupt handler. - "disable_msi_bypass=1": Use this when multiple NVMe devices are mounted on the same PCIe node with a high volume of 4K random I/O. It mitigates excessive pressure on the PCIe node caused by numerous interrupts from NVMe drives, resulting in improved I/O performance. Such as: In FIO 4K random test when 4 NVME(Gen4) mounted on the same PCIE port: - Enable bypass: read: IOPS=562k, BW=2197MiB/s, io=644GiB - Disable bypass: read: IOPS=1144k, BW=4470MiB/s, io=1310GiB As not all devices support VMD MSI-X bypass, this parameter is only applicable to devices that support the bypass function and have already enabled it, such as VMD_28C0. Besides, this parameter does not affect the MSI-X working mode in guest. Signed-off-by: Xinghui Li <korantli@xxxxxxxxxxx> --- drivers/pci/controller/vmd.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c index 990630ec57c6..8ee673810cbf 100644 --- a/drivers/pci/controller/vmd.c +++ b/drivers/pci/controller/vmd.c @@ -34,6 +34,20 @@ #define MB2_SHADOW_OFFSET 0x2000 #define MB2_SHADOW_SIZE 16 +/* + * The VMD disable_msi_bypass module parameter provides the alternative + * way to adjust MSI mode when loading vmd.ko. This parameter is only applicable + * to devices that both support and have enabled bypass, such as VMD_28C0. + * Besides, it does not affect MSI-X mode in the guest. + * + * 1: disable MSI-X bypass + * other values: not disable MSI-X bypass + */ +static int disable_msi_bypass; +module_param(disable_msi_bypass, int, 0444); +MODULE_PARM_DESC(disable_msi_bypass, "Whether to disable MSI-X bypass function.\n" + "\t\t Only effective on the device supporting bypass, such as 28C0."); + enum vmd_features { /* * Device may contain registers which hint the physical location of the @@ -875,6 +889,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) return ret; vmd_set_msi_remapping(vmd, true); + dev_info(&vmd->dev->dev, "init vmd with remapping MSI-X\n"); ret = vmd_create_irq_domain(vmd); if (ret) @@ -887,6 +902,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI); } else { vmd_set_msi_remapping(vmd, false); + dev_info(&vmd->dev->dev, "init vmd with bypass MSI-X\n"); } pci_add_resource(&resources, &vmd->resources[0]); @@ -955,6 +971,17 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) return 0; } +static void vmd_config_msi_bypass_param(unsigned long *features) +{ + /* + * Not every VMD device supports and enables bypass MSI-X. + * Make sure current device has the bypass flag set. + */ + if (disable_msi_bypass == 1 && + *features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) + *features &= ~(VMD_FEAT_CAN_BYPASS_MSI_REMAP); +} + static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id) { unsigned long features = (unsigned long) id->driver_data; @@ -984,6 +1011,8 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id) if (err < 0) goto out_release_instance; + vmd_config_msi_bypass_param(&features); + vmd->cfgbar = pcim_iomap(dev, VMD_CFGBAR, 0); if (!vmd->cfgbar) { err = -ENOMEM; -- 2.31.1