From: Jiwei Sun <sunjw10@xxxxxxxxxx>

When we run a quick hot-add/hot-remove test (within 1 second) with a
PCIe Gen 5 NVMe disk, there is a possibility that the PCIe bridge link
speed drops from 32 GT/s to 2.5 GT/s:

  pcieport 10002:00:04.0: pciehp: Slot(75): Link Down
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  ...
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: broken device, retraining non-functional downstream link at 2.5GT/s
  pcieport 10002:00:04.0: pciehp: Slot(75): No link
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): Link Up
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pcieport 10002:00:04.0: pciehp: Slot(75): No device found
  pcieport 10002:00:04.0: pciehp: Slot(75): Card present
  pci 10002:02:00.0: [144d:a826] type 00 class 0x010802 PCIe Endpoint
  pci 10002:02:00.0: BAR 0 [mem 0x00000000-0x00007fff 64bit]
  pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x00007fff 64bit]
  pci 10002:02:00.0: VF BAR 0 [mem 0x00000000-0x001fffff 64bit]: contains BAR 0 for 64 VFs
  pci 10002:02:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 10002:00:04.0 (capable of 126.028 Gb/s with 32.0 GT/s PCIe x4 link)

If an NVMe disk is hot-removed, the pciehp interrupt will be triggered, and the kernel
thread pciehp_ist will be woken up, and pcie_failed_link_retrain() will
be called, per the following call trace:

  irq/87-pciehp-2524 [121] ..... 152046.006765: pcie_failed_link_retrain <-pcie_wait_for_link
  irq/87-pciehp-2524 [121] ..... 152046.006782: <stack trace>
  => [FTRACE TRAMPOLINE]
  => pcie_failed_link_retrain
  => pcie_wait_for_link
  => pciehp_check_link_status
  => pciehp_enable_slot
  => pciehp_handle_presence_or_link_change
  => pciehp_ist
  => irq_thread_fn
  => irq_thread
  => kthread
  => ret_from_fork
  => ret_from_fork_asm

According to the investigation, the issue is caused by the following
scenario:

  NVMe disk             pciehp hardirq          pciehp irq
  hot-remove            top-half                kernel thread
  ======================================================================
  hot-remove
                        pciehp hardirq is
                        triggered
                        CPU handles the
                        pciehp hardirq
                                                pciehp irq kthread is
                                                woken up
                                                pciehp_ist
                                                  ...
                                                  pcie_failed_link_retrain
                                                    read PCI_EXP_LNKCTL2 register
                                                    read PCI_EXP_LNKSTA register
  NVMe disk is
  hot-added before
  pcie_retrain_link()
  is called
                                                    set target speed to 2.5GT
                                                      pcie_bwctrl_change_speed
                                                    pcie_retrain_link

The retraining will succeed because the NVMe disk is present again, but
because pci_match_id() returns 0 in pcie_failed_link_retrain() (the
device is not an ASMedia ASM2824), the 2.5 GT/s restriction is never
removed, and the Target Link Speed field of the Link Control 2 Register
keeps the value 0x1 (2.5 GT/s).

To fix the issue, skip the link-retraining workaround for all devices
except the ASMedia ASM2824.
Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Reported-by: Adrian Huang <ahuang12@xxxxxxxxxx>
Signed-off-by: Jiwei Sun <sunjw10@xxxxxxxxxx>
---
 drivers/pci/quirks.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 605628c810a5..ff04ebd9ae16 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -104,6 +104,9 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	u16 lnksta, lnkctl2;
 	int ret = -ENOTTY;
 
+	if (!pci_match_id(ids, dev))
+		return 0;
+
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
 		return ret;
@@ -129,8 +132,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
 	}
 
 	if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
-	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
-	    pci_match_id(ids, dev)) {
+	    (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT) {
 		u32 lnkcap;
 
 		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
-- 
2.34.1