On Thu, Feb 15, 2024 at 04:55:47PM +0530, Vidya Sagar wrote:
> On 15-02-2024 00:42, Bjorn Helgaas wrote:
> > Hi Vidya, question about ancient history:
> >
> > On Tue, Aug 13, 2019 at 05:06:27PM +0530, Vidya Sagar wrote:
> > > Add support for Synopsys DesignWare core IP based PCIe host controller
> > > present in Tegra194 SoC.
> > > ...
> > > +static int tegra_pcie_dw_host_init(struct pcie_port *pp)
> > > +{
> > > +	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> > > +	struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
> > > +	u32 val, tmp, offset, speed;
> > > +
> > > +	tegra_pcie_prepare_host(pp);
> > > +
> > > +	if (dw_pcie_wait_for_link(pci)) {
> > > +		/*
> > > +		 * There are some endpoints which can't get the link up if
> > > +		 * root port has Data Link Feature (DLF) enabled.
> > > +		 * Refer Spec rev 4.0 ver 1.0 sec 3.4.2 & 7.7.4 for more info
> > > +		 * on Scaled Flow Control and DLF.
> > > +		 * So, need to confirm that is indeed the case here and attempt
> > > +		 * link up once again with DLF disabled.
> >
> > This comment suggests that there's an issue with *Endpoints*, not an
> > issue with the Root Port. If so, it seems like this problem could
> > occur with all Root Ports, not just Tegra194. Do you remember any
> > details about this?
> >
> > I don't remember hearing about any similar issues, and this driver is
> > the only place PCI_EXT_CAP_ID_DLF is referenced, so maybe it is
> > actually something related to Tegra194?
>
> We noticed PCIe link-up issues with some endpoints. link-up at the physical
> layer level but NOT at the Data link layer level precisely. We further
> figured out that it is the DLFE DLLPs that the root port sends during the
> link up process which are causing the endpoints get confused and preventing
> them from sending the InitFC DLLPs leading to the link not being up at
> Data Link Layer level.

Do you happen to remember any of the endpoints that have issues?
Could save some painful debugging if we trip over this issue on other
systems.

We have seen a few cases where links wouldn't train at full speed
unless they trained at a lower speed first, e.g.,
imx6_pcie_start_link(), fu740_pcie_start_link(). I guess there are
probably lots of edge cases that can cause link failures.

Bjorn