On Sat, Apr 29, 2023 at 3:58 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Fri, Apr 28, 2023 at 01:40:36PM -0500, Bjorn Helgaas wrote: > > On Thu, Apr 20, 2023 at 03:09:14PM +0800, korantwork@xxxxxxxxx wrote: > > > From: Xinghui Li <korantli@xxxxxxxxxxx> > > > What if you made boolean parameters like these: > > > > no_msi_remap > > > > If the VMD supports it, disable VMD MSI-X remapping. This > > improves interrupt performance because child device interrupts > > avoid the VMD MSI-X domain interrupt handler. > > > > msi_remap > > > > Remap child MSI-X interrupts into VMD MSI-X interrupts. This > > limits the number of MSI-X vectors available to the whole child > > device domain to the number of VMD MSI-X interrupts. > > I guess having two parameters that affect the same feature is also > confusing. Maybe just "msi_remap=0" or "msi_remap=1" or something? > > I think what makes "disable_msi_bypass=0" hard is that "MSI bypass" by > itself is a negative feature (the positive activity is MSI remapping), > and disabling bypass gets us back to the positive "MSI remapping" > situation, and "disable_msi_bypass=0" negates that again, so we're > back to ... uh ... let's see ... we are not disabling the bypass of > MSI remapping, so I guess MSI remapping would be *enabled*? Is that > right? > > Bjorn I am fine with these two ways naming of the param. Adjusting from enable_msi_remaping to disable_msi_bypass was aimed to trying address your comment about dealing with the device not supporting bypass. And in vmd drivers, the vmd bypass feature is enabled by adding the flag "VMD_FEAT_CAN_BYPASS_MSI_REMAP". Therefore, I think disabling bypass seems more appropriate. This patch aims to provide a convenient way to remove that flag in a specific case. On Sat, Apr 29, 2023 at 2:40 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > The "disable_msi_bypass" parameter name also leads to some complicated > logic. IIUC, "disable_msi_bypass=0" means "do not disable MSI remap > bypassing" or, in other words, "do not remap MSIs." This is only > supported by some VMDs. Using "disable_msi_bypass=0" to *enable* the > bypass feature is confusing. However, as you said, it does lead to some complicated logic. So, I also believe that these two approaches have their own pros and cons. > I still don't understand what causes the performance problem here. I > guess you see higher performance when the VMD remaps child MSIs? So > adding the VMD MSI-X domain interrupt handler and squashing all the > child MSI vectors into the VMD MSI vector space makes things better? > That seems backwards. Is this because of cache effects or something? > What does "excessive pressure on the PCIe node" mean? I assume the > PCIe node means the VMD? It receives the same number of child > interrupts in either case. What I mean is that there will be performance issues when a PCIe domain is fully loaded with 4 Gen4 NVMe devices. like this: +-[10002:00]-+-01.0-[01]----00.0 device0 | +-03.0-[02]----00.0 device1 | +-05.0-[03]----00.0 device2 | \-07.0-[04]----00.0 device3 According to the perf/irqtop tool, we found the os does get the same counts of interrupts in different modes. However, when the above situation occurs, we observed a significant increase in CPU idle time. Besides, the data and performance when using the bypass VMD feature were identical to when VMD was disabled. And after the devices mounted on a domain are reduced, the IOPS of individual devices will rebound somewhat. Therefore, we speculate that VMD can play a role in balancing and buffering interrupt loads. Therefore, in this situation, we believe that VMD ought to not be bypassed to handle MSI. > > Besides, this parameter does not affect the MSI-X working mode in > > guest. > I don't understand what you're saying here. From the patch, I think > that "disable_msi_bypass=1", i.e., "always remap child MSIs", means we > pretend this VMD doesn't support the VMCONFIG_MSI_REMAP bit. In that > case MSI remapping always happens. > If the user may need to use "disable_msi_bypass=1" (or "msi_remap") in > some virtualization scenarios, we should mention that and maybe give a > hint about what happens *without* that parameter. I apologize for the confusion, I think I missed the keyword 'passthrough mode'. In the virtualization scenarios, VMD doesn't support bypassing MSI-X when it is set to passthrough mode. PCI: vmd: Disable MSI-X remapping when possible(commit ee81ee8): + /* + * Currently MSI remapping must be enabled in guest passthrough mode + * due to some missing interrupt remapping plumbing. This is probably + * acceptable because the guest is usually CPU-limited and MSI + * remapping doesn't become a performance bottleneck. + */ This patch will not change this point, I just wanted to mention it again~ Thanks for your reviewing. I hope this reply can address your points.