Re: [RFC] ASPM L1 link latencies

Ian Kumlien <ian.kumlien@xxxxxxxxx> · Thu, 30 Jul 2020 01:02:02 +0200

On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote:
> > Hi,
> >
> > A while ago I realised that I was having all kinds off issues with my
> > connection, ~933 mbit had become ~40 mbit
> >
> > This only applied on links to the internet (via a linux fw running
> > NAT) however while debugging with the help of Alexander Duyck
> > we realised that ASPM could be the culprit (at least disabling ASPM on
> > the nic it self made things work just fine)...
> >
> > So while trying to understand PCIe and such things, I found this:
> >
> > The calculations of the max delay looked at "that node" + start latency * "hops"
> >
> > But one hop might have a larger latency and break the acceptable delay...
> >
> > So after a lot playing around with the code, i ended up with this, and
> > it seems to fix my problem and does
> > set two pcie bridges to ASPM Disabled that didn't happen before.
> >
> > I do however have questions.... Shouldn't the change be applied to
> > the endpoint?  Or should it be applied recursively along the path to
> > the endpoint?
>
> I don't understand this very well, but I think we do need to consider
> the latencies along the entire path.  PCIe r5.0, sec 5.4.1.3, contains
> this:
>
>   Power management software, using the latency information reported by
>   all components in the Hierarchy, can enable the appropriate level of
>   ASPM by comparing exit latency for each given path from Root to
>   Endpoint against the acceptable latency that each corresponding
>   Endpoint can withstand.

One of the questions is this:
They say from root to endpoint while we walk from endpoint to root

So, is that more optimal in some way? or should latencies always be
considered from root to endpoint?
In that case, should the link ASPM be disabled somewhere else?
(I tried to disable them on the "endpoint" and it didn't help for some reason)

> Also this:

[--8<--]

>   - For each component with one or more Endpoint Functions, PCI
>     Express system software examines the Endpoint L0s/L1 Acceptable
>     Latency, as reported by each Endpoint Function in its Link
>     Capabilities register, and enables or disables L0s/L1 entry (via
>     the ASPM Control field in the Link Control register) accordingly
>     in some or all of the intervening device Ports on that hierarchy.

> > Also, the L0S checks are only done on the local links, is this
> > correct?
>
> ASPM configuration is done on both ends of a link.  I'm not sure it
> makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends
> of the link support it.  In particular, sec 5.4.1.3 says:
>
>   Software must not enable L0s in either direction on a given Link
>   unless components on both sides of the Link each support L0s;
>   otherwise, the result is undefined.
>
> But I think we do need to consider the entire path when enabling L0s;
> from sec 7.5.3.3:
>
>   Endpoint L0s Acceptable Latency - This field indicates the
>   acceptable total latency that an Endpoint can withstand due to the
>   transition from L0s state to the L0 state. It is essentially an
>   indirect measure of the Endpoint’s internal buffering.  Power
>   management software uses the reported L0s Acceptable Latency number
>   to compare against the L0s exit latencies reported by all components
>   comprising the data path from this Endpoint to the Root Complex Root
>   Port to determine whether ASPM L0s entry can be used with no loss of
>   performance.
>
> Does any of that help answer your question?

Yes! It's exactly what I wanted to know, :)

So now the question is should I group the fixes into one patch or
separate them for easier bisecting?

> Bjorn
>
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index b17e5ffd31b1..bd53fba7f382 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -434,7 +434,7 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
> >
> >  static void pcie_aspm_check_latency(struct pci_dev *endpoint)
> >  {
> > -       u32 latency, l1_switch_latency = 0;
> > +       u32 latency, l1_max_latency = 0, l1_switch_latency = 0;
> >         struct aspm_latency *acceptable;
> >         struct pcie_link_state *link;
> >
> > @@ -470,8 +470,9 @@ static void pcie_aspm_check_latency(struct pci_dev
> > *endpoint)
> >                  * substate latencies (and hence do not do any check).
> >                  */
> >                 latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
> > +               l1_max_latency = max_t(u32, latency, l1_max_latency);
> >                 if ((link->aspm_capable & ASPM_STATE_L1) &&
> > -                   (latency + l1_switch_latency > acceptable->l1))
> > +                   (l1_max_latency + l1_switch_latency > acceptable->l1))
> >                         link->aspm_capable &= ~ASPM_STATE_L1;
> >                 l1_switch_latency += 1000;