Re: [PATCH v3] PCI: aardvark: Use LTSSM state to build link training flag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 14, 2019 at 02:45:34PM +0100, Marc Zyngier wrote:
> Hi Remi,
> 
> On 2019-10-14 14:06, Remi Pommarel wrote:
> > Hi Lorenzo, Marc,
> > 
> > On Mon, Oct 14, 2019 at 11:01:29AM +0100, Lorenzo Pieralisi wrote:
> > > On Sun, Oct 13, 2019 at 11:34:15AM +0100, Marc Zyngier wrote:
> > > > On Tue, 1 Oct 2019 09:05:46 +0100
> > > > Andrew Murray <andrew.murray@xxxxxxx> wrote:
> > > >
> > > > Hi Lorenzo,
> > > >
> > > > > On Mon, Sep 30, 2019 at 06:52:30PM +0200, Remi Pommarel wrote:
> > > > > > On Mon, Sep 30, 2019 at 04:40:18PM +0100, Andrew Murray wrote:
> > > > > > > On Wed, May 22, 2019 at 11:33:51PM +0200, Remi Pommarel
> > > wrote:
> > > > > > > > Aardvark's PCI_EXP_LNKSTA_LT flag in its link status
> > > register is not
> > > > > > > > implemented and does not reflect the actual link training
> > > state (the
> > > > > > > > flag is always set to 0). In order to support link
> > > re-training feature
> > > > > > > > this flag has to be emulated. The Link Training and Status
> > > State
> > > > > > > > Machine (LTSSM) flag in Aardvark LMI config register could
> > > be used as
> > > > > > > > a link training indicator. Indeed if the LTSSM is in L0 or
> > > upper state
> > > > > > > > then link training has completed (see [1]).
> > > > > > > >
> > > > > > > > Unfortunately because after asking a link retraining it
> > > takes a while
> > > > > > > > for the LTSSM state to become less than 0x10 (due to L0s
> > > to recovery
> > > > > > > > state transition delays), LTSSM can still be in L0 while
> > > link training
> > > > > > > > has not finished yet. So this waits for link to be in
> > > recovery or lesser
> > > > > > > > state before returning after asking for a link retrain.
> > > > > > > >
> > > > > > > > [1] "PCI Express Base Specification", REV. 4.0
> > > > > > > >     PCI Express, February 19 2014, Table 4-14
> > > > > > > >
> > > > > > > > Signed-off-by: Remi Pommarel <repk@xxxxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > Changes since v1:
> > > > > > > >   - Rename retraining flag field
> > > > > > > >   - Fix DEVCTL register writing
> > > > > > > >
> > > > > > > > Changes since v2:
> > > > > > > >   - Rewrite patch logic so it is more legible
> > > > > > > >
> > > > > > > > Please note that I will unlikely be able to answer any
> > > comments from May
> > > > > > > > 24th to June 10th.
> > > > > > > > ---
> > > > > > > >  drivers/pci/controller/pci-aardvark.c | 29
> > > ++++++++++++++++++++++++++-
> > > > > > > >  1 file changed, 28 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > b/drivers/pci/controller/pci-aardvark.c
> > > > > > > > index 134e0306ff00..8803083b2174 100644
> > > > > > > > --- a/drivers/pci/controller/pci-aardvark.c
> > > > > > > > +++ b/drivers/pci/controller/pci-aardvark.c
> > > > > > > > @@ -180,6 +180,8 @@
> > > > > > > >  #define LINK_WAIT_MAX_RETRIES		10
> > > > > > > >  #define LINK_WAIT_USLEEP_MIN		90000
> > > > > > > >  #define LINK_WAIT_USLEEP_MAX		100000
> > > > > > > > +#define RETRAIN_WAIT_MAX_RETRIES	10
> > > > > > > > +#define RETRAIN_WAIT_USLEEP_US		2000
> > > > > > > >
> > > > > > > >  #define MSI_IRQ_NUM			32
> > > > > > > >
> > > > > > > > @@ -239,6 +241,17 @@ static int
> > > advk_pcie_wait_for_link(struct advk_pcie *pcie)
> > > > > > > >  	return -ETIMEDOUT;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static void advk_pcie_wait_for_retrain(struct advk_pcie
> > > *pcie)
> > > > > > > > +{
> > > > > > > > +	size_t retries;
> > > > > > > > +
> > > > > > > > +	for (retries = 0; retries < RETRAIN_WAIT_MAX_RETRIES;
> > > ++retries) {
> > > > > > > > +		if (!advk_pcie_link_up(pcie))
> > > > > > > > +			break;
> > > > > > > > +		udelay(RETRAIN_WAIT_USLEEP_US);
> > > > > > > > +	}
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> > > > > > > >  {
> > > > > > > >  	u32 reg;
> > > > > > > > @@ -426,11 +439,20 @@
> > > advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
> > > > > > > >  		return PCI_BRIDGE_EMUL_HANDLED;
> > > > > > > >  	}
> > > > > > > >
> > > > > > > > +	case PCI_EXP_LNKCTL: {
> > > > > > > > +		/* u32 contains both PCI_EXP_LNKCTL and PCI_EXP_LNKSTA
> > > */
> > > > > > > > +		u32 val = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg)
> > > &
> > > > > > > > +			~(PCI_EXP_LNKSTA_LT << 16);
> > > > > > >
> > > > > > > The commit message says "the flag is always set to 0" -
> > > therefore I guess
> > > > > > > you don't *need* to mask out the LT bit here? I assume this
> > > is just
> > > > > > > belt-and-braces but thought I'd check incase I've
> > > misunderstood or if your
> > > > > > > commit message is inaccurate.
> > > > > > >
> > > > > > > In any case masking out the bit (or adding a comment) makes
> > > this code more
> > > > > > > readable as the reader doesn't need to know what the
> > > hardware does with this
> > > > > > > bit.
> > > > > >
> > > > > > Actually vendor eventually responded that the bit was
> > > reserved, but
> > > > > > during my tests it remains to 0.
> > > > > >
> > > > > > So yes I am masking this out mainly for belt-and-braces and
> > > legibility.
> > > > >
> > > > > Thanks for the clarification.
> > > > >
> > > > > >
> > > > > > > > +		if (!advk_pcie_link_up(pcie))
> > > > > > > > +			val |= (PCI_EXP_LNKSTA_LT << 16);
> > > > > > > > +		*value = val;
> > > > > > > > +		return PCI_BRIDGE_EMUL_HANDLED;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > >  	case PCI_CAP_LIST_ID:
> > > > > > > >  	case PCI_EXP_DEVCAP:
> > > > > > > >  	case PCI_EXP_DEVCTL:
> > > > > > > >  	case PCI_EXP_LNKCAP:
> > > > > > > > -	case PCI_EXP_LNKCTL:
> > > > > > > >  		*value = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg);
> > > > > > > >  		return PCI_BRIDGE_EMUL_HANDLED;
> > > > > > > >  	default:
> > > > > > > > @@ -447,8 +469,13 @@
> > > advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
> > > > > > > >
> > > > > > > >  	switch (reg) {
> > > > > > > >  	case PCI_EXP_DEVCTL:
> > > > > > > > +		advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
> > > > > > > > +		break;
> > > > > > >
> > > > > > > Why is this here?
> > > > > > >
> > > > > >
> > > > > > Before PCI_EXP_DEVCTL and PCI_EXP_LNKCTL were doing the same
> > > thing, but
> > > > > > as now PCI_EXP_LNKCTL does extra things (i.e. wait for link to
> > > > > > successfully retrain), they do have different behaviours.
> > > > > >
> > > > > > So this is here so PCI_EXP_DEVCTL keeps its old behaviour and
> > > do not
> > > > > > wait for link retrain in case an unrelated (PCI_EXP_LNKCTL_RL)
> > > bit is
> > > > > > set.
> > > > >
> > > > > Oh yes, of course!
> > > > >
> > > > > Thanks and:
> > > > >
> > > > > Reviewed-by: Andrew Murray <andrew.murray@xxxxxxx>
> > > >
> > > > Is there any hope for this workaround to make it into 5.4?
> > > >
> > > > My EspressoBin (which is blessed with this joke of a PCIe
> > > controller)
> > > > is pretty much a doorstop without it and dies with a SError early
> > > at
> > > > boot.
> > > >
> > > > FWIW:
> > > >
> > > > Tested-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > 
> > > Hi Marc,
> > > 
> > > First thing I will have to mark it as a Fixes: (if Remi can provide
> > > me with a tag that'd be great), usually we send fixes at -rc* for
> > > patches that fix code that went in the current (eg 5.4) material,
> > > I will ask Bjorn to see if we can send this in one of the upcoming
> > > -rc* even if it fixes older code.
> > 
> > Sure, I think this could be considered a fix for the following commit :
> > Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI
> > bridge config space")
> > 
> > Moreover, Marc, I am also a bit supprised that you did not have to use
> > [1] to even be able to boot.
> 
> No, I don't have that one, and yet the system boots fine (although PCI
> doesn't get much use on this box). I guess I'm lucky...
> 
> > Also if you want to be completely immune to this kind of SError (that
> > could theoretically happen if the link goes down for other reasons than
> > being retrained) you would have to use mainline ATF along with [2]. But
> > the chances to hit that are low (could only happen in case of link
> > errors).
> 
> Now you've got me worried. Can you point me to that ATF patch? I'm quite
> curious as to how you recover from an SError on a v8.0 CPU given that it
> has no syndrome information and may as well signal "CPU on fire!"...
> 

The patch is at [1]. Please note that this is done quite similarly for
rcar.

[1] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541

-- 
Remi



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux