On Wed, Sep 23, 2020 at 11:23 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Wed, Sep 23, 2020 at 01:29:18AM +0200, Ian Kumlien wrote: > > On Wed, Sep 23, 2020 at 1:00 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > On Tue, Sep 22, 2020 at 11:02:35PM +0200, Ian Kumlien wrote: > > > > > commit db3d9c4baf4ab177d87b5cd41f624f5901e7390f > > > > Author: Ian Kumlien <ian.kumlien@xxxxxxxxx> > > > > Date: Sun Jul 26 16:01:15 2020 +0200 > > > > > > > > Use maximum latency when determining L1 ASPM > > > > > > > > If it's not, we clear the link for the path that had too large latency. > > > > > > > > Currently we check the maximum latency of upstream and downstream > > > > per link, not the maximum for the path > > > > > > > > This would work if all links have the same latency, but: > > > > endpoint -> c -> b -> a -> root (in the order we walk the path) > > > > > > > > If c or b has the higest latency, it will not register > > > > > > > > Fix this by maintaining the maximum latency value for the path > > > > > > > > See this bugzilla for more information: > > > > https://bugzilla.kernel.org/show_bug.cgi?id=208741 > > > > > > > > This fixes an issue for me where my desktops machines maximum bandwidth > > > > for remote connections dropped from 933 MBit to ~40 MBit. > > > > > > > > The bug became obvious once we enabled ASPM on all links: > > > > 66ff14e59e8a (PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges) > > > > > > I can't connect the dots here yet. I don't see a PCIe-to-PCI/PCI-X > > > bridge in your lspci, so I can't figure out why this commit would make > > > a difference for you. > > > > > > IIUC, the problem device is 03:00.0, the Intel I211 NIC. Here's the > > > path to it: > > > > > > 00:01.2 Root Port to [bus 01-07] > > > 01:00.0 Switch Upstream Port to [bus 02-07] > > > 02:03.0 Switch Downstream Port to [bus 03] > > > 03:00.0 Endpoint (Intel I211 NIC) > > > > > > And I think this is the relevant info: > > > > > > LnkCtl LnkCtl > > > ------DevCap------- ----LnkCap------- -Before- -After-- > > > 00:01.2 L1 <32us L1+ L1- > > > 01:00.0 L1 <32us L1+ L1- > > > 02:03.0 L1 <32us L1+ L1+ > > > 03:00.0 L0s <512ns L1 <64us L0s <2us L1 <16us L0s- L1- L0s- L1- > > > > > > The NIC says it can tolerate at most 512ns of L0s exit latency and at > > > most 64us of L1 exit latency. > > > > > > 02:03.0 doesn't support L0s, and the NIC itself can't exit L0s that > > > fast anyway (it can only do <2us), so L0s should be out of the picture > > > completely. > > > > > > Before your patch, apparently we (or BIOS) enabled L1 on the link from > > > 00:01.2 to 01:00.0, and partially enabled it on the link from 02:03.0 > > > to 03:00.0. > > > > According to the spec, this is managed by the OS - which was the > > change introduced... > > BIOS frequently enables ASPM and, if CONFIG_PCIEASPM_DEFAULT=y, I > don't think Linux touches it unless a user requests it via sysfs. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=66ff14e59e8a "7d715a6c1ae5 ("PCI: add PCI Express ASPM support") added the ability for Linux to enable ASPM, but for some undocumented reason, it didn't enable ASPM on links where the downstream component is a PCIe-to-PCI/PCI-X Bridge. Remove this exclusion so we can enable ASPM on these links." > What does "grep ASPM .config" tell you? CONFIG_PCIEASPM=y CONFIG_PCIEASPM_POWERSAVE=y And all of this worked before the commit above. > Boot with "pci=earlydump" and the dmesg will tell us what the BIOS > did. > > If you do this in the unmodified kernel: > > # echo 1 > /sys/.../0000:03:00.0/l1_aspm > > it should enable L1 for 03:00.0. I'd like to know whether it actually > does, and whether the NIC behaves any differently when L1 is enabled > for the entire path instead of just the first three components. With an unmodified kernel, I have ~40 mbit bandwidth. > If the above doesn't work, you should be able to enable ASPM manually: > > # setpci -s03:00.0 CAP_EXP+0x10.w=0x0042 I used something similar to disable ASPM, which made the nic work again. > > > It looks like we *should* be able to enable L1 on both links since the > > > exit latency should be <33us (first link starts exit at T=0, completes > > > by T=32; second link starts exit at T=1, completes by T=33), and > > > 03:00.0 can tolerate up to 64us. > > > > > > I guess the effect of your patch is to disable L1 on the 00:01.2 - > > > 01:00.0 link? And that makes the NIC work better? I am obviously > > > missing something because I don't understand why the patch does that > > > or why it works better. > > > > It makes it work like normal again, like if i disable ASPM on the > > nic itself... > > I wonder if ASPM is just broken on this device. > __e1000e_disable_aspm() mentions hardware errata on a different Intel > NIC. I doubt it. The total time seems to surpass the time it can handle with it's buffers. Note that the nic is running with ASPM now, and it is working =) > > I don't know which value that reflects, up or down - since we do max > > of both values and > > it actually disables ASPM. > > > > What we can see is that the first device that passes the threshold > > is 01:00.0 > > I don't understand this. 03:00.0 claims to be able to tolerate 64us > of L1 exit latency. The path should only have <33us of latency, so > it's not even close to the 64us threshold. Well, this is why i dumped the raw data - i don't know how lspci reads it > > How can I read more data from PCIe without needing to add kprint... > > > > This is what lspci uses apparently: > > #define PCI_EXP_LNKCAP_L0S 0x07000 /* L0s Exit Latency */ > > #define PCI_EXP_LNKCAP_L1 0x38000 /* L1 Exit Latency */ > > > > But which latencies are those? up or down? > > I think the idea in aspm.c that exit latencies depend on which > direction traffic is going is incorrect. The components at the > upstream and downstream ends of the link may have different exit > latencies, of course. Yes, and the max latency is the maximum time the device can buffer before the link has to be up, so maximum time to establish link must be max(up, down) right?