On Thu, Oct 06, 2016 at 08:13:58AM -0500, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > Hi Serge, > > On Thu, Oct 06, 2016 at 12:34:15PM +0300, Serge Semin wrote: > > Hello linux folks, > > > > Sometime ago I discovered a kernel panic popping up when PCI subsystem was > > trying to enumerate PCI express bus with ASPM service enabled. Here it is: > > > > [ 5.089667] CPU 0 Unable to handle kernel paging request at virtual > > address 00000060, epc == 80317004, ra == 80316ac8 > > [ 5.120952] Oops[#1]: > > ... > > [ 5.528438] Call Trace: > > [ 5.535640] [<80317004>] pcie_aspm_init_link_state+0x6c0/0x814 > > [ 5.552843] [<80300c44>] pci_scan_slot+0x140/0x148 > > [ 5.566957] [<80301dcc>] pci_scan_child_bus+0x50/0x1b0 > > [ 5.582096] [<80301944>] pci_scan_bridge+0x25c/0x694 > > [ 5.596724] [<80301e78>] pci_scan_child_bus+0xfc/0x1b0 > > [ 5.611862] [<80301944>] pci_scan_bridge+0x25c/0x694 > > [ 5.626488] [<80301e78>] pci_scan_child_bus+0xfc/0x1b0 > > [ 5.641628] [<8030215c>] pci_scan_root_bus+0x64/0x124 > > [ 5.656528] [<804ca298>] pcibios_scanbus+0xa8/0x188 > > > > I more than sure you are familiar with the issue, since I've found the > > mailing discussion: "PCI: avoid NULL deref in alloc_pcie_link_state" > > https://patchwork.kernel.org/patch/2751651/ > > https://bugzilla.kernel.org/show_bug.cgi?id=60111 > > > > You closed the bugzilla ticket with the next statement: > > "I'm closing this as invalid because the simulated machine where the problem > > occurs has an invalid PCIe topology (an Upstream Port with no Downstream Port > > or Root Port above it). As far as I know, there is no valid topology, e.g., > > a real hardware machine in the field, that would cause this failure." > > > > I'm strongly disagree with it, since I've got at least two hardware with > > PCIe-bus hierarchy as described in the mailing list. One of them is based on > > Cavium Octeon III CN7020. Here is a ASCII-diagram of PCIe-bus: > > Thanks for this information. I reopened that bugzilla; can you attach > complete dmesg logs and "lspci -vv" output for your systems? As I > mentioned in comment #4, I'm completely open to fixing this. My > objections at the time were (1) there was no known hardware that could > trigger the problem, and (2) the proposed fix was ugly and prone to > future breakage. Since we now have real systems that trip over this, > we need to revisit it. > > Bjorn > Done. Welcome back to the bugzilla thread. -Serge > > -+-[0000:01]---00.0-[02-06]--+-02.0-[03-05]--+-00.0-[04-05]----00.0-[05]-- > > | | \-00.1 Device [111d:808f] > > | \-04.0-[06]----00.0 Device [126f:0750] > > \-[0000:00]- > > > > where 01:00.0 is an Upstream port of IDT PCIe-swtich. > > / # /usr/local/sbin/lspci -v -s 01:00.0 > > 01:00.0 Class 0604: Device 111d:8061 > > Flags: bus master, fast devsel, latency 0 > > Memory at <unassigned> (32-bit, non-prefetchable) [size=2] > > Memory at <unassigned> (32-bit, non-prefetchable) [size=2] > > Bus: primary=01, secondary=02, subordinate=06, sec-latency=0 > > Memory behind bridge: 08000000-0dffffff > > Expansion ROM at <unassigned> [disabled] [size=2] > > Capabilities: [40] Express Upstream Port, MSI 00 > > Capabilities: [c0] Power Management version 3 > > Capabilities: [100] Advanced Error Reporting > > Capabilities: [200] Virtual Channel > > Kernel driver in use: pcieport > > > > As you can see PCI-bus hierarchy doesn't have root port and the very first > > upstream port is directly connected to Host-PCIe bridge of MCU, which of > > course is not listed by the lspci utility. > > > > Despite of Radim Kr?má?, who suggested a fix, which would de-facto just > > turned ASPM off, I found a quick solution, which disabled ASPM only in > > the first link (Host-PCIe=>Upstream port) of PCIe-bus for such hierarchy. > > ASPM for other PCIe-bus topologies shall work the way it was. > > > > I hope the fix will be helpful. > > Thanks, > > > > ============================= > > Serge V. Semin > > Leading Programmer > > Embedded SW development group > > T-platforms > > ============================= > > > > Signed-off-by: Serge Semin <fancer.lancer@xxxxxxxxx> > > > > --- > > drivers/pci/pcie/aspm.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index 0ec649d..a9295f29 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -522,7 +522,8 @@ static struct pcie_link_state *alloc_pcie_link_state(struct pci_dev *pdev) > > INIT_LIST_HEAD(&link->children); > > INIT_LIST_HEAD(&link->link); > > link->pdev = pdev; > > - if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT) { > > + if ((pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT) && > > + (!pci_is_root_bus(pdev->bus->parent))) { > > struct pcie_link_state *parent; > > parent = pdev->bus->parent->self->link_state; > > if (!parent) { > > -- > > 2.6.6 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html