Re: [PATCH V14 4/7] PCI: loongson: Don't access non-existant devices

On 2022/6/28 5:38 AM, Bjorn Helgaas wrote:
On Fri, Jun 17, 2022 at 03:43:27PM +0800, Huacai Chen wrote:
On LS2K/LS7A, some non-existant devices don't return 0xffffffff when
scanning. This is a hardware flaw but we can only avoid it by software
now.

We should say what *does* happen if we do a config read to a device
that doesn't exist.  Machine check, hang, etc?


The device is a hidden device (for debugging only) that should not be
scanned. If it is scanned in an abnormal way, the machine hangs (one case
in the LTP PCI test can trigger the issue, as explained below).


Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
---
  drivers/pci/controller/pci-loongson.c | 19 +++++++++++++++++--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index a1222fc15454..e22142f75d97 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -134,10 +134,20 @@ static void __iomem *cfg0_map(struct loongson_pci *priv, int bus,
  	return priv->cfg0_base + addroff;
  }
+static bool pdev_is_existant(unsigned char bus, unsigned int device, unsigned int function)
+{
+	if ((bus == 0) && (device >= 9 && device <= 20) && (function > 0))
+		return false;

Why do you test pci_is_root_bus() below and "bus == 0" here?  I think
you intend them both to test the same thing.  If so, I think you
should test for "if (pci_is_root_bus(bus) ..." here.


I agree. I think we can use pci_is_root_bus() alone to do the work.


Generally speaking we only probe for functions > 0 if .0 is marked as
multi-function, so I guess this means 00:09.0 is marked as a
multi-function device, but config reads to 00:09.1 would fail?


Yes, definitely. Actually, 00:09.0 is a single device, so function 1
(09.1) will not be scanned (e.g. function 1 is not scanned during PCI
enumeration at kernel boot).

But there is one situation: when running the LTP PCI test case on LS7A,
00:08.2 is a SATA controller (a valid device), and the bus number (0)
and devfn (0x42) are passed to the kernel API pci_scan_slot(), which
carries a clear note: devfn must have zero function. Here the function
in the devfn passed in is not zero but 2, and then in pci_scan_slot():


	for (fn = next_fn(bus, dev, 0); fn > 0; fn = next_fn(bus, dev, fn)) {
		dev = pci_scan_single_device(bus, devfn + fn);
		...
	}


08.2, 08.3, ... and 09.1 will be scanned one by one, so 09.1 (function 1)
is scanned.
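
To make the arithmetic concrete, here is a small user-space sketch of
that walk. The three macros use the same encoding as the kernel's
PCI_DEVFN/PCI_SLOT/PCI_FUNC; walk_slot() is a hypothetical helper, and
it assumes fn simply steps from 1 to 7, which is a simplification of
what next_fn() really does.

```c
#include <stdio.h>

/* Same encoding as the kernel macros, repeated so this builds standalone. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: print each devfn that "devfn + fn" reaches when the
 * loop starts from start_devfn, and return the last one.
 * Assumption: fn runs 1..7 (the real next_fn() may stop earlier).
 */
static unsigned int walk_slot(unsigned int start_devfn)
{
	unsigned int fn, devfn = start_devfn;

	for (fn = 1; fn < 8; fn++) {
		devfn = start_devfn + fn;
		printf("fn=%u -> devfn 0x%02x = %02x.%u\n",
		       fn, devfn, PCI_SLOT(devfn), PCI_FUNC(devfn));
	}
	return devfn;
}
```

Starting from PCI_DEVFN(8, 2), i.e. 0x42, the last step is
0x42 + 7 = 0x49, which decodes to slot 9, function 1.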

+	return true;

Returning "true" here means "the device *may* exist," not "this device
*does* exist," right?  If so, the function name probably should be
"pdev_may_exist()".


Yes, I think pdev_may_exist() may be better.

I guess that when we do a config read to a non-root bus device that
doesn't exist, e.g., "01:00.0", that read terminates with an
Unsupported Request error, the config read gets the ~0 data we expect?


Yes, I think so.

+}
+
  static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devfn,
  			       int where)
  {
  	unsigned char busnum = bus->number;
+	unsigned int device = PCI_SLOT(devfn);
+	unsigned int function = PCI_FUNC(devfn);
  	struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
if (pci_is_root_bus(bus))
@@ -147,8 +157,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
  	 * Do not read more than one device on the bus other than
  	 * the host bus.
  	 */
-	if (priv->data->flags & FLAG_DEV_FIX &&
-			!pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
+	if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
+		if (!pci_is_root_bus(bus) && (device > 0))
+			return NULL;
+	}
+
+	/* Don't access non-existant devices */
+	if (!pdev_is_existant(busnum, device, function))
  		return NULL;

Is this a "forever" hardware bug that will never be fixed, or should
there be a flag like FLAG_DEV_FIX so we only do this on the broken
devices?


No, the next version of LS7A will correct it, so maybe we can use a
FLAG_DEV_FIX-like flag to address it.
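
Putting the review comments together (the rename, the root-bus test and
a per-chip flag), the check might look roughly like the sketch below.
This is only an illustration built in user space, not the next version
of the patch: FLAG_DEV_HIDDEN, struct mock_bus and mock_is_root_bus()
are hypothetical stand-ins, and the slot range 9..20 is taken from the
hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quirk bit; the thread only asks for a "FLAG_DEV_FIX-like" flag. */
#define FLAG_DEV_HIDDEN	(1u << 2)

/* Minimal stand-in for the part of struct pci_bus needed here. */
struct mock_bus {
	struct mock_bus *parent;	/* NULL on the root bus */
};

/* Stand-in for pci_is_root_bus(): a root bus has no parent. */
static bool mock_is_root_bus(const struct mock_bus *bus)
{
	return bus->parent == NULL;
}

/* "true" means the device may exist, not that it does exist. */
static bool pdev_may_exist(const struct mock_bus *bus, unsigned int flags,
			   unsigned int device, unsigned int function)
{
	/* Chips without the quirk are never filtered. */
	if (!(flags & FLAG_DEV_HIDDEN))
		return true;

	/* Hidden functions live only on the root bus, slots 9..20, fn > 0. */
	if (mock_is_root_bus(bus) && device >= 9 && device <= 20 && function > 0)
		return false;

	return true;
}
```

With this shape, 00:09.1 is rejected only on chips that set the flag,
while 00:08.2 and anything behind a bridge are still passed through.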

  	/* CFG0 can only access standard space */
--
2.27.0




