Re: [PATCH 1/1] PCI/ASPM: Handle link retraining race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 19, 2023 at 12:30:34PM +0300, Ilpo Järvinen wrote:
> On Thu, 18 May 2023, Bjorn Helgaas wrote:
> > On Tue, May 02, 2023 at 11:39:23AM +0300, Ilpo Järvinen wrote:
> > > Implementation Note at the end of PCIe r6.0.1 sec 7.5.3.7 recommends
> > > handling LTSSM race to ensure link retraining acquires correct
> > > parameters from the LNKCTL register. According to the implementation
> > > note, LTSSM might transition into Recovery or Configuration state
> > > independently of the driver requesting it, and if retraining due to
> > > such an event is still ongoing, the value written into the LNKCTL
> > > register might not be considered by the link retraining.
> > > 
> > > Ensure link training bit is clear before toggling link retraining bit
> > > to meet the requirements of the Implementation Note.
> > > 
> > > Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
> > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > > Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > 
> > Thanks for this!
> > 
> > The existing pcie_retrain_link() and pcie_wait_for_retrain() both
> > return bool, but neither is named as a predicate, and it's always a
> > little hard for me to keep track of what the true/false return values
> > mean.
> > 
> > I propose tweaking them so they both return 0 for success or
> > -ETIMEDOUT for failure.  What do you think?  It does make the patch
> > bigger, which is kind of unfortunate.
> 
> It's better, yes, unless stable folks think it's not a minimal change.
> 
> As a confirmation for your return tweak improving things, I recall that I 
> had to be careful with the bool in this case for the reasons you mention 
> (it requires more mental capacity and verification which way the return 
> is).
> 
> (Also, expect the error handling reindent to cause a conflict with the RMW 
> series.)

OK, thanks for taking a look!  I applied the revised patch to pci/aspm
for v6.5.  I'll take care of any conflicts with the RMW series.

> > commit f55ef626b57f ("PCI/ASPM: Avoid link retraining race")
> > parent e8d05f522fae
> > Author: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > Date:   Tue May 2 11:39:23 2023 +0300
> > 
> >     PCI/ASPM: Avoid link retraining race
> >     
> >     PCIe r6.0.1, sec 7.5.3.7, recommends setting the link control parameters,
> >     then waiting for the Link Training bit to be clear before setting the
> >     Retrain Link bit.
> >     
> >     This avoids a race where the LTSSM may not use the updated parameters if it
> >     is already in the midst of link training because of other normal link
> >     activity.
> >     
> >     Wait for the Link Training bit to be clear before toggling the Retrain Link
> >     bit to ensure that the LTSSM uses the updated link control parameters.
> >     
> >     [bhelgaas: commit log, return 0 (success)/-ETIMEDOUT instead of bool for
> >     both pcie_wait_for_retrain() and the existing pcie_retrain_link()]
> >     Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> >     Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
> >     Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@xxxxxxxxxxxxxxx
> >     Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> >     Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx>
> >     Cc: stable@xxxxxxxxxxxxxxx
> > 
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 72cdb30a924a..3aa73ecdf86f 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -193,12 +193,39 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
> >  	link->clkpm_disable = blacklist ? 1 : 0;
> >  }
> >  
> > -static bool pcie_retrain_link(struct pcie_link_state *link)
> > +static int pcie_wait_for_retrain(struct pci_dev *pdev)
> >  {
> > -	struct pci_dev *parent = link->pdev;
> >  	unsigned long end_jiffies;
> >  	u16 reg16;
> >  
> > +	/* Wait for Link Training to be cleared by hardware */
> > +	end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
> > +	do {
> > +		pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &reg16);
> > +		if (!(reg16 & PCI_EXP_LNKSTA_LT))
> > +			return 0;
> > +		msleep(1);
> > +	} while (time_before(jiffies, end_jiffies));
> > +
> > +	return -ETIMEDOUT;
> > +}
> > +
> > +static int pcie_retrain_link(struct pcie_link_state *link)
> > +{
> > +	struct pci_dev *parent = link->pdev;
> > +	int rc;
> > +	u16 reg16;
> > +
> > +	/*
> > +	 * Ensure the updated LNKCTL parameters are used during link
> > +	 * training by checking that there is no ongoing link training to
> > +	 * avoid LTSSM race as recommended in Implementation Note at the
> > +	 * end of PCIe r6.0.1 sec 7.5.3.7.
> > +	 */
> > +	rc = pcie_wait_for_retrain(parent);
> > +	if (rc)
> > +		return rc;
> > +
> >  	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
> >  	reg16 |= PCI_EXP_LNKCTL_RL;
> >  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> > @@ -212,15 +239,7 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
> >  		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> >  	}
> >  
> > -	/* Wait for link training end. Break out after waiting for timeout */
> > -	end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
> > -	do {
> > -		pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16);
> > -		if (!(reg16 & PCI_EXP_LNKSTA_LT))
> > -			break;
> > -		msleep(1);
> > -	} while (time_before(jiffies, end_jiffies));
> > -	return !(reg16 & PCI_EXP_LNKSTA_LT);
> > +	return pcie_wait_for_retrain(parent);
> >  }
> >  
> >  /*
> > @@ -289,15 +308,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
> >  		reg16 &= ~PCI_EXP_LNKCTL_CCC;
> >  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> >  
> > -	if (pcie_retrain_link(link))
> > -		return;
> > +	if (pcie_retrain_link(link)) {
> >  
> > -	/* Training failed. Restore common clock configurations */
> > -	pci_err(parent, "ASPM: Could not configure common clock\n");
> > -	list_for_each_entry(child, &linkbus->devices, bus_list)
> > -		pcie_capability_write_word(child, PCI_EXP_LNKCTL,
> > +		/* Training failed. Restore common clock configurations */
> > +		pci_err(parent, "ASPM: Could not configure common clock\n");
> > +		list_for_each_entry(child, &linkbus->devices, bus_list)
> > +			pcie_capability_write_word(child, PCI_EXP_LNKCTL,
> >  					   child_reg[PCI_FUNC(child->devfn)]);
> > -	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg);
> > +		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg);
> > +	}
> >  }
> >  
> >  /* Convert L0s latency encoding to ns */




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux