On 2022/6/28 5:38 AM, Bjorn Helgaas wrote:
On Fri, Jun 17, 2022 at 03:43:27PM +0800, Huacai Chen wrote:
On LS2K/LS7A, some non-existent devices don't return 0xffffffff when
scanning. This is a hardware flaw but we can only avoid it by software
now.
We should say what *does* happen if we do a config read to a device
that doesn't exist. Machine check, hang, etc?
The device is a hidden device (for debugging only) that should not be
scanned. If it is scanned in an abnormal way, the machine hangs (one case
in the LTP PCI test can trigger the issue, as explained below).
Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
---
drivers/pci/controller/pci-loongson.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index a1222fc15454..e22142f75d97 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -134,10 +134,20 @@ static void __iomem *cfg0_map(struct loongson_pci *priv, int bus,
return priv->cfg0_base + addroff;
}
+static bool pdev_is_existant(unsigned char bus, unsigned int device, unsigned int function)
+{
+ if ((bus == 0) && (device >= 9 && device <= 20) && (function > 0))
+ return false;
Why do you test pci_is_root_bus() below and "bus == 0" here? I think
you intend them both to test the same thing. If so, I think you
should test for "if (pci_is_root_bus(bus) ..." here.
I agree; I think we can use only pci_is_root_bus() for this check.
Generally speaking we only probe for functions > 0 if .0 is marked as
multi-function, so I guess this means 00:09.0 is marked as a
multi-function device, but config reads to 00:09.1 would fail?
Yes, definitely. Actually, 00:09.0 is a single-function device, so
function 1 (09.1) will not be scanned (e.g. function 1 is not scanned
during PCI enumeration at kernel boot).
But there is one exception: when running the LTP PCI test case on LS7A,
00:08.2 is a SATA controller (a valid device), and the bus number (0)
and devfn (0x42) are passed to the kernel API pci_scan_slot(), which
carries a clear note: devfn must have zero function. Here the function
part of the devfn passed in is not zero but 2, and then in pci_scan_slot():
    for (fn = next_fn(bus, dev, 0); fn > 0; fn = next_fn(bus, dev, fn)) {
        dev = pci_scan_single_device(bus, devfn + fn);
        ...
    }
08.2, 08.3, ... and 09.1 will be scanned one by one, so 09.1 (function 1)
ends up being scanned.
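
The arithmetic behind this can be checked in userspace. The following is only a minimal sketch, not kernel code: the three macros use the same bit layout as the kernel's PCI_DEVFN(), PCI_SLOT() and PCI_FUNC(), while show_probed_devfns() is a hypothetical helper that assumes the loop visits every function offset from 1 to 7 without skipping any.

```c
#include <stdio.h>

/* Same bit layout as the kernel's PCI_DEVFN(), PCI_SLOT() and PCI_FUNC(). */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: list the devfn values probed by "devfn + fn"
 * for fn = 1..7, assuming no function offset is skipped.
 * Returns the last devfn probed.
 */
unsigned int show_probed_devfns(unsigned int start_devfn)
{
	unsigned int fn, probed = start_devfn;

	for (fn = 1; fn < 8; fn++) {
		probed = start_devfn + fn;
		printf("fn=%u -> devfn 0x%02x (%02x.%u)\n", fn, probed,
		       PCI_SLOT(probed), PCI_FUNC(probed));
	}
	return probed;
}
```

Starting from PCI_DEVFN(8, 2), which is 0x42, the last value probed is 0x42 + 7 = 0x49, i.e. slot 9, function 1. Because the starting function is 2 rather than 0, the additions carry over into the next slot number.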
+ return true;
Returning "true" here means "the device *may* exist," not "this device
*does* exist," right? If so, the function name probably should be
"pdev_may_exist()".
Yes, I think pdev_may_exist() may be better.
I guess that when we do a config read to a non-root bus device that
doesn't exist, e.g., "01:00.0", that read terminates with an
Unsupported Request error and the config read gets the ~0 data we expect?
Yes, I think so.
+}
+
static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devfn,
int where)
{
unsigned char busnum = bus->number;
+ unsigned int device = PCI_SLOT(devfn);
+ unsigned int function = PCI_FUNC(devfn);
struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
if (pci_is_root_bus(bus))
@@ -147,8 +157,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
* Do not read more than one device on the bus other than
* the host bus.
*/
- if (priv->data->flags & FLAG_DEV_FIX &&
- !pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
+ if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
+ if (!pci_is_root_bus(bus) && (device > 0))
+ return NULL;
+ }
+
+ /* Don't access non-existent devices */
+ if (!pdev_is_existant(busnum, device, function))
return NULL;
Is this a "forever" hardware bug that will never be fixed, or should
there be a flag like FLAG_DEV_FIX so we only do this on the broken
devices?
No, the next version of LS7A will correct it, so maybe we can use a
FLAG_DEV_FIX-like flag to address it.
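
One possible shape for such a check, combining the points raised in this thread (the pdev_may_exist name, a root-bus test instead of "bus == 0", and a per-chip flag), is sketched below as a userspace model. It is an illustration under assumptions, not the eventual patch: FLAG_DEV_HIDDEN and its bit value are hypothetical, and struct model_bus with its is_root field merely stands in for what pci_is_root_bus(bus) would report in the kernel.

```c
#include <stdbool.h>

#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Hypothetical quirk bit; the name and value are assumptions. */
#define FLAG_DEV_HIDDEN		(1u << 3)

/* Userspace stand-in for a bus; is_root models pci_is_root_bus(bus). */
struct model_bus {
	bool is_root;
};

/*
 * Returns false only for devfns known to hang the machine on affected
 * chips; true means the device *may* exist, not that it does.
 */
bool pdev_may_exist(const struct model_bus *bus, unsigned int flags,
		    unsigned int devfn)
{
	unsigned int device = PCI_SLOT(devfn);
	unsigned int function = PCI_FUNC(devfn);

	/* Chips without the quirk flag are never filtered. */
	if (!(flags & FLAG_DEV_HIDDEN))
		return true;

	/* Same range as the patch: root bus, devices 9..20, function > 0. */
	if (bus->is_root && device >= 9 && device <= 20 && function > 0)
		return false;

	return true;
}
```

With the flag set, devfn 0x49 (09.1) on the root bus is rejected, while 0x48 (09.0), 0x42 (08.2) and any devfn on a non-root bus are still allowed through. Without the flag, nothing is filtered.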
/* CFG0 can only access standard space */
--
2.27.0