On 1/12/25 00:00, Maciej W. Rozycki wrote: > On Fri, 10 Jan 2025, Jiwei Sun wrote: > >> In order to fix the issue, don't do the retraining work except ASMedia >> ASM2824. > > I yet need to go through all of your submission in detail, but this > assumption defeats the purpose of the workaround, as the current > understanding of the origin of the training failure and the reason to > retrain by hand with the speed limited to 2.5GT/s is the *downstream* > device rather than the ASMedia ASM2824 switch. > > It is also why the quirk has been wired to run everywhere rather than > having been keyed by VID:DID, and the VID:DID of the switch is only > listed, conservatively, because it seems safe with the switch to lift the > speed restriction once the link has successfully completed training. > > Overall I think we need to get your problem sorted differently, because I > suppose in principle your hot-plug scenario could also happen with the > ASMedia ASM2824 switch as the upstream device and your NVMe storage > element as the downstream device. Perhaps the speed restriction could be > always lifted, and then the bandwidth controller infrastructure used for > that, so that it doesn't have to happen within `pcie_failed_link_retrain'? According to our test, the following modification can fix the issue in our test machine. diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 02d2e16672a8..9ca051b86878 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -97,10 +97,6 @@ static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta) */ int pcie_failed_link_retrain(struct pci_dev *dev) { - static const struct pci_device_id ids[] = { - { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */ - {} - }; u16 lnksta, lnkctl2; int ret = -ENOTTY; @@ -128,8 +124,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev) } if ((lnksta & PCI_EXP_LNKSTA_DLLLA) && - (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT && - pci_match_id(ids, dev)) { + (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == + PCI_EXP_LNKCTL2_TLS_2_5GT) { u32 lnkcap; pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n"); But I don't know if the above modification will have any other negative effects on other devices. Could you please share your thoughts? Thanks, Regards, Jiwei > > Maciej