On Mon, Sep 28, 2020 at 1:33 PM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote:
>
> On Mon, Sep 28, 2020 at 10:04 PM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote:
> >
> > On Mon, Sep 28, 2020 at 9:53 PM Alexander Duyck
> > <alexander.duyck@xxxxxxxxx> wrote:

<snip>

> > > You should be able to manually disable L1 on the Realtek link
> > > (04:00.0 <-> 02:04.0) instead of doing it on the upstream link on
> > > the switch. That may provide a data point on the L1 behavior of
> > > the setup. Basically, if you took the Realtek out of the equation
> > > in terms of the L1 exit time, you should see the exit time drop to
> > > no more than 33us, as would be expected with just the i210.
> >
> > Yeah, will try it out with echo 0 >
> > /sys/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:04.0/0000:04:00.0/link/l1_aspm
> > (which is the device reported by my patch)
>
> So, 04:00.0 is already disabled, the existing code apparently handled
> that correctly... *but*
>
> given the path:
> 00:01.2/01:00.0/02:04.0/04:00.0 Unassigned class [ff00]: Realtek
> Semiconductor Co., Ltd. Device 816e (rev 1a)
>
> Walking backwards:
> -- 04:00.0 has L1 disabled
> -- 02:04.0 doesn't have ASPM?!
>
> lspci reports:
> Capabilities: [370 v1] L1 PM Substates
>         L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+ L1_PM_Substates+
>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>         L1SubCtl2:
> Capabilities: [400 v1] Data Link Feature <?>
> Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
> Capabilities: [440 v1] Lane Margining at the Receiver <?>
>
> However, the link directory is empty.
>
> Anything we should know about these unknown capabilities? Also ASPM
> L1.1 and L1.2, heh =)
>
> -- 01:00.0 has L1; disabling it makes the Intel NIC work again

I recall that much. However, the question is why. If there is already a
32us time to bring up the link between the NIC and the switch, why would
the additional 1us to also bring up the upstream port have that much of
an effect? That is why I am thinking it may be worthwhile to isolate
things further, so that only the upstream port and the NIC have L1
enabled. If we are still seeing issues in that state, then I can only
assume there is something off with the 00:01.2 <-> 01:00.0 link, such
that it isn't advertising its actual L1 recovery time. For example, the
"Width x4 (downgraded)" looks very suspicious, and could be responsible
for something like that if link training has to go through exception
cases to negotiate the x4 link instead of a x8.
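For what it's worth, walking that path by hand gets tedious. A quick
Python sketch along the following lines (an illustration only: it
assumes a kernel that exposes the per-device link/l1_aspm attribute
used in the echo above, plus the standard current_link_width,
max_link_width and current_link_speed sysfs files) can dump the L1
state and the negotiated vs. maximum width at every hop, which would
make both the downgraded link and the empty link directory stand out
at a glance:

#!/usr/bin/env python3
# Walk a PCIe device's upstream path in sysfs and print the L1 ASPM
# state plus negotiated/maximum link width and current speed for each
# function on the way up.
import os

# The Realtek device from this thread, used here as the example leaf.
DEV = ("/sys/devices/pci0000:00/0000:00:01.2/0000:01:00.0/"
       "0000:02:04.0/0000:04:00.0")

def read(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "n/a"  # e.g. the empty link directory seen on 02:04.0

node = DEV
while os.path.basename(node).startswith("0000:"):
    print("%s: l1_aspm=%s width=%s/%s speed=%s" % (
        os.path.basename(node),
        read(os.path.join(node, "link", "l1_aspm")),
        read(os.path.join(node, "current_link_width")),
        read(os.path.join(node, "max_link_width")),
        read(os.path.join(node, "current_link_speed"))))
    node = os.path.dirname(node)

A hop printing something like "width=4/8" there would be the downgraded
link in question.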
> ASPM L1 enabled:
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  5.40 MBytes  45.3 Mbits/sec    0   62.2 KBytes
> [  5]   1.00-2.00   sec  4.47 MBytes  37.5 Mbits/sec    0   70.7 KBytes
> [  5]   2.00-3.00   sec  4.10 MBytes  34.4 Mbits/sec    0   42.4 KBytes
> [  5]   3.00-4.00   sec  4.47 MBytes  37.5 Mbits/sec    0   65.0 KBytes
> [  5]   4.00-5.00   sec  4.47 MBytes  37.5 Mbits/sec    0    105 KBytes
> [  5]   5.00-6.00   sec  4.47 MBytes  37.5 Mbits/sec    0   84.8 KBytes
> [  5]   6.00-7.00   sec  4.47 MBytes  37.5 Mbits/sec    0   65.0 KBytes
> [  5]   7.00-8.00   sec  4.10 MBytes  34.4 Mbits/sec    0   45.2 KBytes
> [  5]   8.00-9.00   sec  4.47 MBytes  37.5 Mbits/sec    0   56.6 KBytes
> [  5]   9.00-10.00  sec  4.47 MBytes  37.5 Mbits/sec    0   48.1 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  44.9 MBytes  37.7 Mbits/sec    0          sender
> [  5]   0.00-10.01  sec  44.0 MBytes  36.9 Mbits/sec               receiver
>
> ASPM L1 disabled:
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec   111 MBytes   935 Mbits/sec  733    761 KBytes
> [  5]   1.00-2.00   sec   110 MBytes   923 Mbits/sec  733    662 KBytes
> [  5]   2.00-3.00   sec   109 MBytes   912 Mbits/sec 1036   1.20 MBytes
> [  5]   3.00-4.00   sec   109 MBytes   912 Mbits/sec  647    738 KBytes
> [  5]   4.00-5.00   sec   110 MBytes   923 Mbits/sec  852    744 KBytes
> [  5]   5.00-6.00   sec   109 MBytes   912 Mbits/sec  546    908 KBytes
> [  5]   6.00-7.00   sec   109 MBytes   912 Mbits/sec  303    727 KBytes
> [  5]   7.00-8.00   sec   109 MBytes   912 Mbits/sec  432    769 KBytes
> [  5]   8.00-9.00   sec   110 MBytes   923 Mbits/sec  462    652 KBytes
> [  5]   9.00-10.00  sec   109 MBytes   912 Mbits/sec  576    764 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  1.07 GBytes   918 Mbits/sec  6320        sender
> [  5]   0.00-10.01  sec  1.06 GBytes   912 Mbits/sec              receiver
>
> (all measurements are over the live internet - hence the variance)

I forgot there were 5 devices in total hanging off of there as well.
You might try checking whether disabling L1 on devices 5:00.0, 6:00.0
and/or 7:00.0 has any effect while leaving L1 on 01:00.0 and the NIC
active. The basic idea is to go through and make certain we aren't
seeing an L1 issue with one of the other downstream links on the
switch; see the sketch below.

The more I think about it, the more the entire setup seems a bit
suspicious. I was looking over the lspci tree and the dump from the
system. From what I can tell, the upstream switch link at
00:01.2 <-> 01:00.0 is only a Gen4 x4 link, yet coming off of it are 5
devices: two NICs running at Gen1 or Gen2 x1, and then a USB controller
and two SATA controllers reporting Gen4 x16. Those last 3 devices in
particular have me a bit curious, as they are all reporting L0s and L1
exit latencies at the absolute minimum, which has me wondering whether
they are even reporting actual values.
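To make that experiment quick to repeat, a minimal sketch in the same
vein (again assuming the link/l1_aspm sysfs attribute from earlier in
the thread; the BDFs are the three devices named above) could flip
them off in one go before each iperf3 run:

#!/usr/bin/env python3
# Disable L1 ASPM on the other devices behind the switch, leaving
# 01:00.0 and the NIC untouched. Run as root; write "1" to re-enable.
SIBLINGS = ["0000:05:00.0", "0000:06:00.0", "0000:07:00.0"]

for bdf in SIBLINGS:
    attr = "/sys/bus/pci/devices/%s/link/l1_aspm" % bdf
    try:
        with open(attr, "w") as f:
            f.write("0")  # 0 = L1 disabled on this device's link
        print("%s: L1 disabled" % bdf)
    except OSError as e:
        print("%s: %s" % (bdf, e))  # missing attribute or not root

If the throughput only recovers once a particular sibling has L1 turned
off, that would point at that device's downstream link rather than at
the upstream x4 link.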