On Thu, Jul 30, 2020 at 1:12 AM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote: > > On Thu, Jul 30, 2020 at 1:02 AM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote: > > On Thu, Jul 30, 2020 at 12:47 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > On Sat, Jul 25, 2020 at 09:47:05PM +0200, Ian Kumlien wrote: > > > > Hi, > > > > > > > > A while ago I realised that I was having all kinds off issues with my > > > > connection, ~933 mbit had become ~40 mbit > > > > > > > > This only applied on links to the internet (via a linux fw running > > > > NAT) however while debugging with the help of Alexander Duyck > > > > we realised that ASPM could be the culprit (at least disabling ASPM on > > > > the nic it self made things work just fine)... > > > > > > > > So while trying to understand PCIe and such things, I found this: > > > > > > > > The calculations of the max delay looked at "that node" + start latency * "hops" > > > > > > > > But one hop might have a larger latency and break the acceptable delay... > > > > > > > > So after a lot playing around with the code, i ended up with this, and > > > > it seems to fix my problem and does > > > > set two pcie bridges to ASPM Disabled that didn't happen before. > > > > > > > > I do however have questions.... Shouldn't the change be applied to > > > > the endpoint? Or should it be applied recursively along the path to > > > > the endpoint? > > > > > > I don't understand this very well, but I think we do need to consider > > > the latencies along the entire path. PCIe r5.0, sec 5.4.1.3, contains > > > this: > > > > > > Power management software, using the latency information reported by > > > all components in the Hierarchy, can enable the appropriate level of > > > ASPM by comparing exit latency for each given path from Root to > > > Endpoint against the acceptable latency that each corresponding > > > Endpoint can withstand. > > > > One of the questions is this: > > They say from root to endpoint while we walk from endpoint to root > > > > So, is that more optimal in some way? or should latencies always be > > considered from root to endpoint? > > In that case, should the link ASPM be disabled somewhere else? > > (I tried to disable them on the "endpoint" and it didn't help for some reason) > > > > > Also this: > > > > [--8<--] > > > > > - For each component with one or more Endpoint Functions, PCI > > > Express system software examines the Endpoint L0s/L1 Acceptable > > > Latency, as reported by each Endpoint Function in its Link > > > Capabilities register, and enables or disables L0s/L1 entry (via > > > the ASPM Control field in the Link Control register) accordingly > > > in some or all of the intervening device Ports on that hierarchy. > > > > > > Also, the L0S checks are only done on the local links, is this > > > > correct? > > > > > > ASPM configuration is done on both ends of a link. I'm not sure it > > > makes sense to enable any state (L0s, L1, L1.1, L1.2) unless both ends > > > of the link support it. In particular, sec 5.4.1.3 says: > > > > > > Software must not enable L0s in either direction on a given Link > > > unless components on both sides of the Link each support L0s; > > > otherwise, the result is undefined. > > > > > > But I think we do need to consider the entire path when enabling L0s; > > > from sec 7.5.3.3: > > > > > > Endpoint L0s Acceptable Latency - This field indicates the > > > acceptable total latency that an Endpoint can withstand due to the > > > transition from L0s state to the L0 state. It is essentially an > > > indirect measure of the Endpoint’s internal buffering. Power > > > management software uses the reported L0s Acceptable Latency number > > > to compare against the L0s exit latencies reported by all components > > > comprising the data path from this Endpoint to the Root Complex Root > > > Port to determine whether ASPM L0s entry can be used with no loss of > > > performance. > > > > > > Does any of that help answer your question? > > > > Yes! It's exactly what I wanted to know, :) > > > > So now the question is should I group the fixes into one patch or > > separate them for easier bisecting? > > Actually this raises a few questions... > > It does sound like this is sum(link->latency_up.l0s) + > sum(link->latency_dw.l0s) of the link vs acceptable->l0s > > But, would that mean that we walk the link backwards? so it's both sides? > > Currently they are separated - and they are not diaabled as a whole... > > How should we handle the difference between up and down to keep the > finer grained control we have? so, this: diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index bd53fba7f382..18b659e6d3e8 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev, static void pcie_aspm_check_latency(struct pci_dev *endpoint) { - u32 latency, l1_max_latency = 0, l1_switch_latency = 0; + u32 latency, l1_max_latency = 0, l1_switch_latency = 0, + l0s_max_latency = 0; struct aspm_latency *acceptable; struct pcie_link_state *link; @@ -448,14 +449,18 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint) while (link) { /* Check upstream direction L0s latency */ - if ((link->aspm_capable & ASPM_STATE_L0S_UP) && - (link->latency_up.l0s > acceptable->l0s)) - link->aspm_capable &= ~ASPM_STATE_L0S_UP; + if (link->aspm_capable & ASPM_STATE_L0S_UP) { + l0s_max_latency += link->latency_up.l0s; + if (l0s_max_latency > acceptable->l0s) + link->aspm_capable &= ~ASPM_STATE_L0S_UP; + } /* Check downstream direction L0s latency */ - if ((link->aspm_capable & ASPM_STATE_L0S_DW) && - (link->latency_dw.l0s > acceptable->l0s)) - link->aspm_capable &= ~ASPM_STATE_L0S_DW; + if (link->aspm_capable & ASPM_STATE_L0S_DW) { + l0s_max_latency += link->latency_dw.l0s; + if (l0s_max_latency > acceptable->l0s) + link->aspm_capable &= ~ASPM_STATE_L0S_DW; + } /* * Check L1 latency. * Every switch on the path to root complex need 1 --- vs ---- diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index bd53fba7f382..74dee09e788b 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev, static void pcie_aspm_check_latency(struct pci_dev *endpoint) { - u32 latency, l1_max_latency = 0, l1_switch_latency = 0; + u32 latency, l1_max_latency = 0, l1_switch_latency = 0, + l0s_max_latency = 0; struct aspm_latency *acceptable; struct pcie_link_state *link; @@ -448,14 +449,17 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint) while (link) { /* Check upstream direction L0s latency */ - if ((link->aspm_capable & ASPM_STATE_L0S_UP) && - (link->latency_up.l0s > acceptable->l0s)) - link->aspm_capable &= ~ASPM_STATE_L0S_UP; + if (link->aspm_capable & ASPM_STATE_L0S_UP) + l0s_max_latency += link->latency_up.l0s; /* Check downstream direction L0s latency */ - if ((link->aspm_capable & ASPM_STATE_L0S_DW) && - (link->latency_dw.l0s > acceptable->l0s)) - link->aspm_capable &= ~ASPM_STATE_L0S_DW; + if (link->aspm_capable & ASPM_STATE_L0S_DW) + l0s_max_latency += link->latency_dw.l0s; + + /* If the latency exceeds, disable both */ + if (l0s_max_latency > acceptable->l0s) + link->aspm_capable &= ~ASPM_STATE_L0S; + /* * Check L1 latency. * Every switch on the path to root complex need 1 ---- Currently we don't make a difference between them and I don't quite understand the upstream and downstream terminology since it's a serial bus ;)