On Tue, Jul 30, 2024 at 10:51:15AM -0700, Nirmal Patel wrote: > On Tue, 30 Jul 2024 10:58:30 +0530 > Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> wrote: > > > On Mon, Jul 29, 2024 at 01:08:59PM -0700, Nirmal Patel wrote: > > > On Thu, 25 Jul 2024 09:40:13 +0530 > > > Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> wrote: > > > > > > > On Wed, Jul 24, 2024 at 02:10:30PM -0500, Bjorn Helgaas wrote: > > > > > On Wed, Jul 24, 2024 at 10:00:40AM -0700, Nirmal Patel wrote: > > > > > > > > > > > VMD does not support legacy interrupts for devices downstream > > > > > > from a VMD endpoint. So initialize the PCI_INTERRUPT_LINE to 0 > > > > > > for these devices to ensure we don't try to set up a legacy > > > > > > irq for them. > > > > > > > > > > s/legacy interrupts/INTx/ > > > > > s/legacy irq/INTx/ > > > > > > > > > > > Note: This patch was proposed by Jim, I am trying to upstream > > > > > > it. > > > > > > > > > > > > Signed-off-by: Jim Harris <james.r.harris@xxxxxxxxx> > > > > > > Signed-off-by: Nirmal Patel <nirmal.patel@xxxxxxxxxxxxxxx> > > > > > > --- > > > > > > arch/x86/pci/fixup.c | 14 ++++++++++++++ > > > > > > 1 file changed, 14 insertions(+) > > > > > > > > > > > > diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c > > > > > > index b33afb240601..a3b34a256e7f 100644 > > > > > > --- a/arch/x86/pci/fixup.c > > > > > > +++ b/arch/x86/pci/fixup.c > > > > > > @@ -653,6 +653,20 @@ static void quirk_no_aersid(struct > > > > > > pci_dev *pdev) > > > > > > DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, > > > > > > PCI_ANY_ID, PCI_CLASS_BRIDGE_PCI, 8, quirk_no_aersid); > > > > > > +#if IS_ENABLED(CONFIG_VMD) > > > > > > +/* > > > > > > + * VMD does not support legacy interrupts for downstream > > > > > > devices. > > > > > > + * So PCI_INTERRPUT_LINE needs to be initialized to 0 to > > > > > > ensure OS > > > > > > + * doesn't try to configure a legacy irq. > > > > > > > > > > s/legacy interrupts/INTx/ > > > > > s/PCI_INTERRPUT_LINE/PCI_INTERRUPT_LINE/ > > > > > > > > > > > + */ > > > > > > +static void quirk_vmd_interrupt_line(struct pci_dev *dev) > > > > > > +{ > > > > > > + if (is_vmd(dev->bus)) > > > > > > + pci_write_config_byte(dev, > > > > > > PCI_INTERRUPT_LINE, 0); +} > > > > > > +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, > > > > > > quirk_vmd_interrupt_line); > > > > > > > > > > A quirk for every PCI device, even on systems without VMD, seems > > > > > like kind of a clumsy way to deal with this. > > > > > > > > > > Conceptually, I would expect a host bridge driver (VMD acts > > > > > like a host bridge in this case) to know whether it supports > > > > > INTx, and if the driver knows it doesn't support INTx or it has > > > > > no _PRT or DT description of INTx routing to use, an attempt to > > > > > configure INTx should just fail naturally. > > > > > > > > > > I don't claim this is how host bridge drivers actually work; I > > > > > just think it's the way they *should* work. > > > > > > > > > > > > > Absolutely! This patch is fixing the issue in a wrong place. There > > > > are existing DT based host bridge drivers that disable INTx due to > > > > lack of hardware capability. The driver just need to nullify > > > > pci_host_bridge::map_irq callback. > > > > > > > > - Mani > > > > > > > For VMD as a host bridge, pci_host_bridge::map_irq is null. Even all > > > VMD rootports' PCI_INTERRUPT_LINE registers are set to 0. > > > > If map_irq is already NULL, then how INTx is being configured? In > > your patch description: > VMD uses MSIx. > > > > "So initialize the PCI_INTERRUPT_LINE to 0 for these devices to > > ensure we don't try to set up a legacy irq for them." > > > > Who is 'we'? For sure the PCI core wouldn't set INTx in your case. > > Does 'we' refer to device firmware? > > > > >Since VMD > > > doesn't explicitly set PCI_INTERRUPT_LINE register to 0 for all of > > > its sub-devices (i.e. NVMe), if some NVMes has non-zero value set > > > for PCI_INTERRUPT_LINE (i.e. 0xff) then some software like SPDK can > > > read it and make wrong assumption about INTx support. > > > > > > > Is this statement is true (I haven't heard of before), then don't we > > need to set PCI_INTERRUPT_LINE to 0 for all devices irrespective of > > host bridge? > Since VMD doesn't support legacy interrupt, BIOS sets > PCI_INTERRUPT_LINE registers to 0 for all of the VMD rootports but > not the NVMes'. > > According to PCIe base specs, "Values in this register are > programmed by system software and are system architecture specific. > The Function itself does not use this value; rather the value in this > register is used by device drivers and operating systems." > > We had an issue raised on us sometime back because some SSDs have 0xff > (i.e. Samsung) set to these registers by firmware and SPDK was reading > them when SSDs were behind VMD which led them to believe VMD had INTx > support enabled. After some testing, it made more sense to clear these > registers for all of the VMD owned devices. > This is a valuable information that should've been present in the patch description. Now I can understand the intention of your patch. Previously I couldn't. > > > > > Based Bjorn's and your suggestion, it might be better if VMD sets > > > PCI_INTERRUPT_LINE register for all of its sub-devices during VMD > > > enumeration. > > > > > > > What about hotplug devices? > That is a good question and because of that I thought of putting the > fix in fixup.c. But I am open to your suggestion since fixup is not the > right place. > How about the below change? diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c index 4555630be9ec..140df1138f14 100644 --- a/drivers/pci/irq.c +++ b/drivers/pci/irq.c @@ -147,6 +147,13 @@ void pci_assign_irq(struct pci_dev *dev) struct pci_host_bridge *hbrg = pci_find_host_bridge(dev->bus); if (!(hbrg->map_irq)) { + /* + * Some userspace applications like SPDK reads + * PCI_INTERRUPT_LINE to decide whether INTx is enabled or not. + * So write 0 to make sure they understand that INTx is disabled + * for the device. + */ + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, 0); pci_dbg(dev, "runtime IRQ mapping not provided by arch\n"); return; } So this sets PCI_INTERRUPT_LINE to 0 for _all_ devices that don't support INTx. As per your explanation above, the issue you are seeing is not just applicable to VMD, but for all devices. - Mani -- மணிவண்ணன் சதாசிவம்