[+cc Maciej for similar retrain issue] On Tue, Nov 08, 2022 at 04:29:44PM -0600, Bjorn Helgaas wrote: > On Thu, Jun 02, 2022 at 06:55:44AM +0000, Nathan Rossi wrote: > > From: Nathan Rossi <nathan.rossi@xxxxxxxx> > > > > When retraining the link either the child or the parent device may have > > the data link layer state machine of the respective devices move out of > > the active state despite the physical link training being completed. > > Depending on how long is takes for the devices to return to the active > > state, the device may not be ready and any further reads/writes to the > > device can fail. > > > > This issue is present with the pci-mvebu controller paired with a device > > supporting ASPM but without advertising the Slot Clock, where during > > boot the pcie_aspm_cap_init call would cause common clocks to be made > > consistent and then retrain the link. However the data link layer would > > not be active before any device initialization (e.g. ASPM capability > > queries, BAR configuration) causing improper configuration of the device > > without error. > > > > To ensure the child device is accessible, after the link retraining use > > pcie_wait_for_link to perform the associated state checks and any needed > > delays. > > > > Signed-off-by: Nathan Rossi <nathan.rossi@xxxxxxxx> > > --- > > drivers/pci/pcie/aspm.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index a96b7424c9..4b8a1810be 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -288,7 +288,8 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link) > > reg16 &= ~PCI_EXP_LNKCTL_CCC; > > pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); > > > > - if (pcie_retrain_link(link)) > > + /* Retrain link and then wait for the link to become active */ > > + if (pcie_retrain_link(link) && pcie_wait_for_link(parent, true)) > > pcie_retrain_link() waits for PCI_EXP_LNKSTA_LT (Link Training) to be > cleared, which means the LTSSM has exited the Configuration/Recovery > state. pcie_wait_for_link() waits for PCI_EXP_LNKSTA_DLLLA (Data Link > Layer Link Active) to be set, which means the link is in DL_Active. > > I don't see an explicit procedure in the spec for determining when > a link retrain is complete, but from PCIe r6.0, sec 6.2.11 (DPC): > > After software releases the Downstream Port from DPC, the Port’s > LTSSM must transition to the Detect state, where the Link will > attempt to retrain. Software can use Data Link Layer State Changed > interrupts, DL_ACTIVE ERR_COR signaling, or both, to signal when the > Link reaches the DL_Active state again. > > and sec 6.6: > > On the completion of Link Training (entering the DL_Active state, > see Section 3.2), a component must be able to receive and process > TLPs and DLLPs. > > The only use mentioned in the spec for the Link Training bit is the > implementation note in sec 7.5.3.7 about avoiding race conditions when > using the Retrain Link bit, where software should poll Link Training > until it returns to zero before setting the Retrain Link bit to change > link parameters. > > And I think you're absolutely right that what we *want* here is the > data link layer DL_Active state, not just the link layer L0 state. > > This all makes me think that checking the Link Training bit might be > the wrong thing to begin with. > > Of course, the Data Link Layer Link Active bit wasn't added until PCIe > r1.1, and even now it's optional. Without it, I don't know if there's > a way to make sure the link is in DL_Active. > > Maybe pcie_retrain_link() should wait for Data Link Layer Link Active > if it is supported, and use the existing behavior of waiting for Link > Training to be cleared otherwise? Nathan, I meant to cc you on this thread, which is doing something very similar. Just FYI: https://lore.kernel.org/all/alpine.DEB.2.21.2209130050380.60554@xxxxxxxxxxxxxxxxx/