On Mon, Nov 11, 2024 at 11:32:27PM -0800, Jenishkumar Maheshbhai Patel wrote:
> When the attached device recovers the link from
> an external reset, the following error might be
> seen upon pci rescan.
>
> On link-down event, it's not necessary to remove
> the root bus. Only the child buses or devices
> should be wiped off. However, the rescan operation
> should be performed only when the link could be
> retained. Otherwise, it should be done by a user
> manually after the link is finally recovered.

Wrap to fill 75 columns.

s/pci/PCI/
s/bar/BAR/ (subject)

> ~# echo 1 > /sys/bus/pci/rescan
> [ 322.857504] pci 0000:01:00.0: [177d:b200] type 00 class 0x028000
> [ 322.863682] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x007fffff 64bit pref]
> [ 322.871031] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x0fffffff 64bit pref]
> [ 322.878362] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x03ffffff 64bit pref]
> [ 322.886845] pci 0000:01:00.0: reg 0x244: [mem 0x00000000-0x000fffff 64bit pref]
> [ 322.894193] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x007fffff 64bit pref] (contains BAR0 for 8 VFs)
> [ 322.905154] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x2 link at 0000:00:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
> [ 322.921371] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
> [ 322.929507] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
> [ 322.937999] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
> [ 322.946131] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
> [ 322.954614] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
> [ 322.962225] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
> [ 322.970193] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
> [ 322.977804] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
> [ 322.985766] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
> [ 322.993373] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
> [ 323.001331] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
> [ 323.008938] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
> [ 323.016903] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
> [ 323.024511] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
> [ 323.032469] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
> [ 323.040079] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
> [ 323.048037] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
> [ 323.055644] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
> [ 323.063601] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
> [ 323.071211] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
> [ 323.081914] pcieport 0002:02:03.0: devices behind bridge are unusable because [bus 03] cannot be assigned for them
> [ 323.092384] pcieport 0002:02:07.0: devices behind bridge are unusable because [bus 04] cannot be assigned for them
> [ 323.102857] pcieport 0002:01:00.0: bridge has subordinate 02 but max busn 04

Remove timestamps; they don't help us understand.

We probably don't need *all* the lines here to understand the problem.

Collect output from current kernel, which should use more useful
labels than "reg 0x10", "BAR 15", etc.

> Signed-off-by: Jenishkumar Maheshbhai Patel <jpatel2@xxxxxxxxxxx>
> ---
>  drivers/pci/controller/dwc/pcie-armada8k.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-armada8k.c b/drivers/pci/controller/dwc/pcie-armada8k.c
> index f9d6907900d1..ca2dedaa69a4 100644
> --- a/drivers/pci/controller/dwc/pcie-armada8k.c
> +++ b/drivers/pci/controller/dwc/pcie-armada8k.c
> @@ -231,6 +231,7 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>  	struct dw_pcie_rp *pp = &pcie->pci->pp;
>  	struct pci_bus *bus = pp->bridge->bus;
>  	struct pci_dev *root_port;
> +	struct pci_dev *child, *tmp;
>  	int ret;
>
>  	root_port = pci_get_slot(bus, 0);
> @@ -239,7 +240,14 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>  		return;
>  	}
>  	pci_lock_rescan_remove();
> -	pci_stop_and_remove_bus_device(root_port);
> +
> +	/* Remove all devices under root bus */
> +	list_for_each_entry_safe(child, tmp,
> +				 &root_port->subordinate->devices, bus_list) {
> +		pci_stop_and_remove_bus_device(child);
> +		dev_dbg(&child->dev, "removed\n");
> +	}
> +
>  	/* Reset device if reset gpio is set */
>  	if (pcie->reset_gpio) {
>  		/* assert and then deassert the reset signal */
> @@ -279,11 +287,12 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>
>  	/* Wait until the link becomes active again */
>  	if (dw_pcie_wait_for_link(pcie->pci))
> -		dev_err(pcie->pci->dev, "Link not up after reconfiguration\n");
> +		goto fail;
> +
> +	dev_dbg(pcie->pci->dev, "%s: link has been recovered\n", __func__);
>
> -	bus = NULL;
> -	while ((bus = pci_find_next_bus(bus)) != NULL)
> -		pci_rescan_bus(bus);
> +	/* Rescan the root bus only if link is retained */
> +	pci_rescan_bus(bus);
>
>  fail:
>  	pci_unlock_rescan_remove();
> --
> 2.25.1
>
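
For reference, the hunk that replaces pci_stop_and_remove_bus_device(root_port)
walks the child list with list_for_each_entry_safe(), which caches the next
entry before the loop body can remove the current one. Below is a rough
userspace sketch of that remove-while-iterating pattern. It is not kernel
code: struct node, push() and remove_all() are hypothetical names made up
for illustration, and a plain singly linked list stands in for the bus's
device list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one device on a bus's device list. */
struct node {
	int id;
	struct node *next;
};

/* Push a new node at the head of the list; returns the new head. */
static struct node *push(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->id = id;
	n->next = head;
	return n;
}

/*
 * Remove every node. The successor is saved in tmp *before* the current
 * entry is freed, because reading cur->next after free(cur) would be a
 * use-after-free. Returns the number of nodes removed.
 */
static int remove_all(struct node **head)
{
	struct node *cur, *tmp;
	int removed = 0;

	assert(head);
	for (cur = *head; cur; cur = tmp) {
		tmp = cur->next;	/* cache successor first */
		free(cur);		/* cur must not be touched after this */
		removed++;
	}
	*head = NULL;
	return removed;
}
```

The same reasoning suggests that anything that dereferences an entry should
happen before that entry is removed, not after.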