On Wed, 31 Jul 2024 08:37:39 +0530
Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> wrote:

> On Tue, Jul 30, 2024 at 10:51:15AM -0700, Nirmal Patel wrote:
> > On Tue, 30 Jul 2024 10:58:30 +0530
> > Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> wrote:
> > 
> > > On Mon, Jul 29, 2024 at 01:08:59PM -0700, Nirmal Patel wrote:
> > > > On Thu, 25 Jul 2024 09:40:13 +0530
> > > > Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> wrote:
> > > > 
> > > > > On Wed, Jul 24, 2024 at 02:10:30PM -0500, Bjorn Helgaas wrote:
> > > > > > On Wed, Jul 24, 2024 at 10:00:40AM -0700, Nirmal Patel wrote:
> > > > > > > VMD does not support legacy interrupts for devices
> > > > > > > downstream from a VMD endpoint. So initialize the
> > > > > > > PCI_INTERRUPT_LINE to 0 for these devices to ensure we
> > > > > > > don't try to set up a legacy irq for them.
> > > > > > 
> > > > > > s/legacy interrupts/INTx/
> > > > > > s/legacy irq/INTx/
> > > > > > 
> > > > > > > Note: This patch was proposed by Jim, I am trying to
> > > > > > > upstream it.
> > > > > > > 
> > > > > > > Signed-off-by: Jim Harris <james.r.harris@xxxxxxxxx>
> > > > > > > Signed-off-by: Nirmal Patel <nirmal.patel@xxxxxxxxxxxxxxx>
> > > > > > > ---
> > > > > > >  arch/x86/pci/fixup.c | 14 ++++++++++++++
> > > > > > >  1 file changed, 14 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> > > > > > > index b33afb240601..a3b34a256e7f 100644
> > > > > > > --- a/arch/x86/pci/fixup.c
> > > > > > > +++ b/arch/x86/pci/fixup.c
> > > > > > > @@ -653,6 +653,20 @@ static void quirk_no_aersid(struct pci_dev *pdev)
> > > > > > >  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> > > > > > >  			      PCI_CLASS_BRIDGE_PCI, 8, quirk_no_aersid);
> > > > > > > +#if IS_ENABLED(CONFIG_VMD)
> > > > > > > +/*
> > > > > > > + * VMD does not support legacy interrupts for downstream devices.
> > > > > > > + * So PCI_INTERRPUT_LINE needs to be initialized to 0 to ensure OS
> > > > > > > + * doesn't try to configure a legacy irq.
> > > > > > 
> > > > > > s/legacy interrupts/INTx/
> > > > > > s/PCI_INTERRPUT_LINE/PCI_INTERRUPT_LINE/
> > > > > > 
> > > > > > > + */
> > > > > > > +static void quirk_vmd_interrupt_line(struct pci_dev *dev)
> > > > > > > +{
> > > > > > > +	if (is_vmd(dev->bus))
> > > > > > > +		pci_write_config_byte(dev, PCI_INTERRUPT_LINE, 0);
> > > > > > > +}
> > > > > > > +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, quirk_vmd_interrupt_line);
> > > > > > 
> > > > > > A quirk for every PCI device, even on systems without VMD,
> > > > > > seems like kind of a clumsy way to deal with this.
> > > > > > 
> > > > > > Conceptually, I would expect a host bridge driver (VMD acts
> > > > > > like a host bridge in this case) to know whether it supports
> > > > > > INTx, and if the driver knows it doesn't support INTx or it
> > > > > > has no _PRT or DT description of INTx routing to use, an
> > > > > > attempt to configure INTx should just fail naturally.
> > > > > > 
> > > > > > I don't claim this is how host bridge drivers actually
> > > > > > work; I just think it's the way they *should* work.
> > > > > 
> > > > > Absolutely! This patch is fixing the issue in the wrong place.
> > > > > There are existing DT based host bridge drivers that disable
> > > > > INTx due to lack of hardware capability. The driver just needs
> > > > > to nullify the pci_host_bridge::map_irq callback.
> > > > > 
> > > > > - Mani
> > > > > 
> > > > For VMD as a host bridge, pci_host_bridge::map_irq is NULL.
> > > > Even all VMD rootports' PCI_INTERRUPT_LINE registers are set to
> > > > 0.
> > > 
> > > If map_irq is already NULL, then how is INTx being configured? In
> > > your patch description:
> > VMD uses MSI-X.
> > > 
> > > "So initialize the PCI_INTERRUPT_LINE to 0 for these devices to
> > > ensure we don't try to set up a legacy irq for them."
> > > 
> > > Who is 'we'? For sure the PCI core wouldn't set INTx in your case.
> > > Does 'we' refer to device firmware?
> > > 
> > > > Since VMD doesn't explicitly set the PCI_INTERRUPT_LINE register
> > > > to 0 for all of its sub-devices (i.e. NVMe), if some NVMes have
> > > > a non-zero value set for PCI_INTERRUPT_LINE (e.g. 0xff), then
> > > > some software like SPDK can read it and make a wrong assumption
> > > > about INTx support.
> > > 
> > > If this statement is true (I haven't heard of this before), then
> > > don't we need to set PCI_INTERRUPT_LINE to 0 for all devices,
> > > irrespective of the host bridge?
> > Since VMD doesn't support legacy interrupts, BIOS sets the
> > PCI_INTERRUPT_LINE registers to 0 for all of the VMD rootports but
> > not the NVMes'.
> > 
> > According to the PCIe base spec, "Values in this register are
> > programmed by system software and are system architecture specific.
> > The Function itself does not use this value; rather the value in
> > this register is used by device drivers and operating systems."
> > 
> > We had an issue raised against us some time back because some SSDs
> > (e.g. Samsung) have 0xff set in these registers by firmware, and
> > SPDK was reading them when the SSDs were behind VMD, which led them
> > to believe VMD had INTx support enabled. After some testing, it
> > made more sense to clear these registers for all of the VMD-owned
> > devices.
> > 
> 
> This is valuable information that should've been present in the
> patch description. Now I can understand the intention of your patch.
> Previously I couldn't.
> 
> > > > 
> > > > Based on Bjorn's and your suggestion, it might be better if VMD
> > > > sets the PCI_INTERRUPT_LINE register for all of its sub-devices
> > > > during VMD enumeration.
> > > > 
> > > 
> > > What about hotplug devices?
> > That is a good question, and because of that I thought of putting
> > the fix in fixup.c. But I am open to your suggestion since fixup is
> > not the right place.
> > 
> 
> How about the below change?
> 
> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
> index 4555630be9ec..140df1138f14 100644
> --- a/drivers/pci/irq.c
> +++ b/drivers/pci/irq.c
> @@ -147,6 +147,13 @@ void pci_assign_irq(struct pci_dev *dev)
>  	struct pci_host_bridge *hbrg = pci_find_host_bridge(dev->bus);
>  
>  	if (!(hbrg->map_irq)) {
> +		/*
> +		 * Some userspace applications like SPDK read
> +		 * PCI_INTERRUPT_LINE to decide whether INTx is enabled or not.
> +		 * So write 0 to make sure they understand that INTx is disabled
> +		 * for the device.
> +		 */
> +		pci_write_config_byte(dev, PCI_INTERRUPT_LINE, 0);
>  		pci_dbg(dev, "runtime IRQ mapping not provided by arch\n");
>  		return;
>  	}
> 
> So this sets PCI_INTERRUPT_LINE to 0 for _all_ devices that don't
> support INTx. As per your explanation above, the issue you are seeing
> is not just applicable to VMD, but to all devices.
> 
> - Mani
> 

Thanks for the suggestion. Let me test the changes.

-nirmal
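For illustration only, a minimal userspace sketch of the kind of check
being discussed: it reads PCI_INTERRUPT_LINE from a device's sysfs
config file and treats a non-zero value as a hint that INTx is routed.
This is not SPDK's actual code, and the device path below is a
hypothetical placeholder; with the pci_assign_irq() change above, a
device whose host bridge provides no map_irq would read back 0 here.

    /* sketch: read PCI_INTERRUPT_LINE from sysfs config space */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define PCI_INTERRUPT_LINE_OFF	0x3c	/* offset in the config header */

    int main(void)
    {
    	/* hypothetical BDF; substitute the device under test */
    	const char *cfg = "/sys/bus/pci/devices/0000:01:00.0/config";
    	uint8_t line;
    	int fd = open(cfg, O_RDONLY);

    	if (fd < 0) {
    		perror("open");
    		return 1;
    	}
    	if (pread(fd, &line, 1, PCI_INTERRUPT_LINE_OFF) != 1) {
    		perror("pread");
    		close(fd);
    		return 1;
    	}
    	close(fd);

    	/* 0 means no INTx is routed for this device */
    	printf("PCI_INTERRUPT_LINE = 0x%02x -> %s\n", line,
    	       line ? "may assume INTx is routed" : "INTx disabled");
    	return 0;
    }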