On Fri, May 19, 2023 at 12:30:34PM +0300, Ilpo Järvinen wrote: > On Thu, 18 May 2023, Bjorn Helgaas wrote: > > On Tue, May 02, 2023 at 11:39:23AM +0300, Ilpo Järvinen wrote: > > > Implementation Note at the end of PCIe r6.0.1 sec 7.5.3.7 recommends > > > handling LTSSM race to ensure link retraining acquires correct > > > parameters from the LNKCTL register. According to the implementation > > > note, LTSSM might transition into Recovery or Configuration state > > > independently of the driver requesting it, and if retraining due to > > > such an event is still ongoing, the value written into the LNKCTL > > > register might not be considered by the link retraining. > > > > > > Ensure link training bit is clear before toggling link retraining bit > > > to meet the requirements of the Implementation Note. > > > > > > Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") > > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx> > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> > > > Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx> > > > Cc: stable@xxxxxxxxxxxxxxx > > > > Thanks for this! > > > > The existing pcie_retrain_link() and pcie_wait_for_retrain() both > > return bool, but neither is named as a predicate, and it's always a > > little hard for me to keep track of what the true/false return values > > mean. > > > > I propose tweaking them so they both return 0 for success or > > -ETIMEDOUT for failure. What do you think? It does make the patch > > bigger, which is kind of unfortunate. > > It's better, yes, unless stable folks think it's not a minimal change. > > As a confirmation for your return tweak improving things, I recall that I > had to be careful with the bool in this case for the reasons you mention > (it requires more mental capacity and verification which way the return > is). > > (Also, expect the error handling reindent to cause a conflict with the RMW > series.) OK, thanks for taking a look! I applied the revised patch to pci/aspm for v6.5. I'll take care of any conflicts with the RMW series. > > commit f55ef626b57f ("PCI/ASPM: Avoid link retraining race") > > parent e8d05f522fae > > Author: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> > > Date: Tue May 2 11:39:23 2023 +0300 > > > > PCI/ASPM: Avoid link retraining race > > > > PCIe r6.0.1, sec 7.5.3.7, recommends setting the link control parameters, > > then waiting for the Link Training bit to be clear before setting the > > Retrain Link bit. > > > > This avoids a race where the LTSSM may not use the updated parameters if it > > is already in the midst of link training because of other normal link > > activity. > > > > Wait for the Link Training bit to be clear before toggling the Retrain Link > > bit to ensure that the LTSSM uses the updated link control parameters. > > > > [bhelgaas: commit log, return 0 (success)/-ETIMEDOUT instead of bool for > > both pcie_wait_for_retrain() and the existing pcie_retrain_link()] > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx> > > Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") > > Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@xxxxxxxxxxxxxxx > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> > > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx> > > Cc: stable@xxxxxxxxxxxxxxx > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index 72cdb30a924a..3aa73ecdf86f 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -193,12 +193,39 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) > > link->clkpm_disable = blacklist ? 1 : 0; > > } > > > > -static bool pcie_retrain_link(struct pcie_link_state *link) > > +static int pcie_wait_for_retrain(struct pci_dev *pdev) > > { > > - struct pci_dev *parent = link->pdev; > > unsigned long end_jiffies; > > u16 reg16; > > > > + /* Wait for Link Training to be cleared by hardware */ > > + end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; > > + do { > > + pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, ®16); > > + if (!(reg16 & PCI_EXP_LNKSTA_LT)) > > + return 0; > > + msleep(1); > > + } while (time_before(jiffies, end_jiffies)); > > + > > + return -ETIMEDOUT; > > +} > > + > > +static int pcie_retrain_link(struct pcie_link_state *link) > > +{ > > + struct pci_dev *parent = link->pdev; > > + int rc; > > + u16 reg16; > > + > > + /* > > + * Ensure the updated LNKCTL parameters are used during link > > + * training by checking that there is no ongoing link training to > > + * avoid LTSSM race as recommended in Implementation Note at the > > + * end of PCIe r6.0.1 sec 7.5.3.7. > > + */ > > + rc = pcie_wait_for_retrain(parent); > > + if (rc) > > + return rc; > > + > > pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); > > reg16 |= PCI_EXP_LNKCTL_RL; > > pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); > > @@ -212,15 +239,7 @@ static bool pcie_retrain_link(struct pcie_link_state *link) > > pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); > > } > > > > - /* Wait for link training end. Break out after waiting for timeout */ > > - end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; > > - do { > > - pcie_capability_read_word(parent, PCI_EXP_LNKSTA, ®16); > > - if (!(reg16 & PCI_EXP_LNKSTA_LT)) > > - break; > > - msleep(1); > > - } while (time_before(jiffies, end_jiffies)); > > - return !(reg16 & PCI_EXP_LNKSTA_LT); > > + return pcie_wait_for_retrain(parent); > > } > > > > /* > > @@ -289,15 +308,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link) > > reg16 &= ~PCI_EXP_LNKCTL_CCC; > > pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); > > > > - if (pcie_retrain_link(link)) > > - return; > > + if (pcie_retrain_link(link)) { > > > > - /* Training failed. Restore common clock configurations */ > > - pci_err(parent, "ASPM: Could not configure common clock\n"); > > - list_for_each_entry(child, &linkbus->devices, bus_list) > > - pcie_capability_write_word(child, PCI_EXP_LNKCTL, > > + /* Training failed. Restore common clock configurations */ > > + pci_err(parent, "ASPM: Could not configure common clock\n"); > > + list_for_each_entry(child, &linkbus->devices, bus_list) > > + pcie_capability_write_word(child, PCI_EXP_LNKCTL, > > child_reg[PCI_FUNC(child->devfn)]); > > - pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); > > + pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); > > + } > > } > > > > /* Convert L0s latency encoding to ns */