On Fri, Mar 29, 2019 at 06:07:33PM +0100, Stefan Mätje wrote:
> This patch provides a quirk that works around PCIe link retrain issues
> with some Pericom PCIe-to-PCI bridges.
>
> These patches should be backported to the stable kernels ([1/3] and [2/3]). The
> problem shows up since our customers use kernels 4.10 and later.
>
> The original patch was only for the Pericom PI7C9X111SL bridge. This is the
> only hardware I can test with.
>
> In the meantime Brett Hull <bhull@xxxxxxxxxx> brought to my attention that
> the PI7C9X110 and PI7C9X130 devices are also affected. I had access to the
> errata sheet for the PI7C9X130 PCI bridge and it has the same issue
> documented.
>
> Therefore the patch will now handle all three mentioned devices.
>
> I'd like to quote the errata sheet PI7C9X111SLB_errata_rev1.2_102711.pdf
>
> > In Reverse Mode, retrain Link bit is not cleared automatically; this bit
> > needs to be cleared manually by configuration write after it is set.
> >
> > Problem:
> > In Reverse mode, after setting Retrain Link (bit 5 of register C0h), this
> > bit will stay on and PI7C9x111SL will continuously retrain until this bit
> > is cleared by another Configuration Write to register C0h.
> >
> > Workaround:
> > Issue another configuration write to clear Retrain Link bit after setting
> > this bit. No delay is required between these two configuration write.
>
> There is no public URL to download these errata sheets. Because Pericom has
> been acquired by Diodes Inc. all information has to be downloaded from their
> web site. Following the link below one can find a datasheet and there is a
> button to request additional documents like the errata sheet for instance.
>
> https://www.diodes.com/products/connectivity-and-timing/pcie-packet-switchbridges/pcie-pci-bridges/part/PI7C9X111SL#tab-details
>
> I'm still using msleep(1) here because I don't want to change the timing for
> other parts of the kernel unintentionally.
> Some notes regarding this:
>
> a) The link retraining is fast. The time measured on two different PCs and PCIe
>    bridges is below 5µs.
> b) My slower test PC needs ~2.5µs for checking the PCI_EXP_LNKSTA register; the
>    faster test PC needs ~1.5µs for that. As a result, the slower test PC runs
>    through the wait loop without calling msleep(1) but the faster machine needs
>    to wait once with msleep(1).
>    In this case waiting with udelay(10) would perhaps be more appropriate but
>    I didn't want to burden this pure bug fix patch with a timing change that
>    possibly needs discussion.
>
> History:
> [V3]
> - Split into more patches to make backporting easier.
> - Applied Andy Shevchenko's recommendations regarding wording and
>   code style.
> [V2]
> https://lore.kernel.org/linux-pci/20190305173122.11875-1-stefan.maetje@xxxxxx/
> [V1]
> https://lore.kernel.org/linux-pci/20181101192229.48352-2-stefan.maetje@xxxxxx/
>
>
> Stefan Mätje (3):
>   PCI/ASPM: Prepare stand-alone pcie_retrain_link() function
>   PCI/ASPM: Work around link retrain errata of Pericom PCIe-to-PCI
>     bridges
>   PCI/ASPM: Trivial rework wait code in pcie_retrain_link()
>
>  drivers/pci/pcie/aspm.c | 47 ++++++++++++++++++++++++++++++++---------------
>  drivers/pci/quirks.c    | 20 ++++++++++++++++++++
>  include/linux/pci.h     |  2 ++
>  3 files changed, 54 insertions(+), 15 deletions(-)

Thanks, applied to pci/enumeration for v5.2. I added stable tags for
the first two patches.

I also moved the quirk and related bits out from under CONFIG_PCIEASPM
because I don't think the erratum is actually related to ASPM; it just
happens that ASPM is the only place we currently retrain links. But
we *could* retrain links for other reasons, e.g., changing link speed
or error recovery, and we would also want the quirk then.
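
For readers who want to see the erratum and its workaround in isolation,
here is a small userspace model. It is only a sketch under stated
assumptions, not the code that was applied: struct fake_bridge, the
fake_*() helpers, the rl_sticks field and the poll budget are all made up
for illustration, and the quirk flag name clear_retrain_link is assumed
rather than taken from the text above. The two bit values match
PCI_EXP_LNKCTL_RL and PCI_EXP_LNKSTA_LT, and bit 5 is the Retrain Link bit
the errata sheet refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL_RL 0x0020 /* Retrain Link, bit 5 of Link Control */
#define LNKSTA_LT 0x0800 /* Link Training in progress, Link Status */

/* Hypothetical model of a bridge; this is not a kernel structure. */
struct fake_bridge {
	uint16_t lnkctl;         /* simulated Link Control register */
	bool rl_sticks;          /* true: erratum present, RL does not self-clear */
	bool clear_retrain_link; /* assumed quirk flag: clear RL by hand */
	unsigned int cfg_writes; /* number of config writes issued */
};

/* Simulated config write to Link Control. */
static void fake_write_lnkctl(struct fake_bridge *b, uint16_t val)
{
	assert(b);
	b->cfg_writes++;
	/* On a conforming device the RL bit clears itself. */
	if (!b->rl_sticks)
		val = (uint16_t)(val & ~LNKCTL_RL);
	b->lnkctl = val;
}

/* Simulated Link Status read: training never ends while RL is stuck. */
static uint16_t fake_read_lnksta(const struct fake_bridge *b)
{
	assert(b);
	return (b->lnkctl & LNKCTL_RL) ? LNKSTA_LT : 0;
}

/* Returns true if link training finished within max_polls status reads. */
static bool fake_retrain_link(struct fake_bridge *b, int max_polls)
{
	uint16_t ctl = (uint16_t)(b->lnkctl | LNKCTL_RL);

	fake_write_lnkctl(b, ctl);
	if (b->clear_retrain_link) {
		/* Workaround from the errata sheet: second write, no delay. */
		ctl = (uint16_t)(ctl & ~LNKCTL_RL);
		fake_write_lnkctl(b, ctl);
	}

	for (int i = 0; i < max_polls; i++) {
		if (!(fake_read_lnksta(b) & LNKSTA_LT))
			return true;
	}
	return false;
}
```

In this model a bridge with rl_sticks set never leaves training unless
clear_retrain_link is also set, in which case the retrain costs exactly
two config writes. A real wait loop would sleep or delay between polls
and give up after a timeout; the model replaces that with a plain poll
count to stay self-contained.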