On Fri, 3 Nov 2023, Mario Limonciello wrote: > The USB4 spec specifies that PCIe ports that are used for tunneling > PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and > behave as a PCIe Gen1 device. The actual performance of these ports is > controlled by the fabric implementation. > > Downstream drivers such as amdgpu which utilize pcie_bandwidth_available() > to program the device will always find the PCIe ports used for > tunneling as a limiting factor potentially leading to incorrect > performance decisions. > > To prevent problems in downstream drivers check explicitly for ports > being used for PCIe tunneling and skip them when looking for bandwidth > limitations of the hierarchy. If the only device connected is a root port > used for tunneling then report that device. > > Downstream drivers could make this change on their own but then they > wouldn't be able to detect other potential speed bottlenecks from the > hierarchy without duplicating pcie_bandwidth_available() logic. > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860 > Link: https://www.usb.org/document-library/usb4r-specification-v20 > USB4 V2 with Errata and ECN through June 2023 > Section 11.2.1 > Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> > --- > drivers/pci/pci.c | 74 +++++++++++++++++++++++++++++++---------------- > 1 file changed, 49 insertions(+), 25 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index d9aa5a39f585..15e37164ce56 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -6223,6 +6223,35 @@ int pcie_set_mps(struct pci_dev *dev, int mps) > } > EXPORT_SYMBOL(pcie_set_mps); > > +static u32 pcie_calc_bw_limits(struct pci_dev *dev, u32 bw, > + struct pci_dev **limiting_dev, > + enum pci_bus_speed *speed, > + enum pcie_link_width *width) > +{ > + enum pcie_link_width next_width; > + enum pci_bus_speed next_speed; > + u32 next_bw; > + u16 lnksta; > + > + pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta); > + next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; > + next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT; > + next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed); > + > + /* Check if current device limits the total bandwidth */ > + if (!bw || next_bw <= bw) { > + bw = next_bw; > + if (limiting_dev) > + *limiting_dev = dev; > + if (speed) > + *speed = next_speed; > + if (width) > + *width = next_width; > + } > + > + return bw; > +} > + > /** > * pcie_bandwidth_available - determine minimum link settings of a PCIe > * device and its bandwidth limitation > @@ -6236,47 +6265,42 @@ EXPORT_SYMBOL(pcie_set_mps); > * limiting_dev, speed, and width pointers are supplied) information about > * that point. The bandwidth returned is in Mb/s, i.e., megabits/second of > * raw bandwidth. > + * > + * This excludes the bandwidth calculation that has been returned from a > + * PCIe device used for transmitting tunneled PCIe traffic over a Thunderbolt > + * or USB4 link that is part of larger hierarchy. The calculation is excluded > + * because the USB4 specification specifies that the max speed returned from > + * PCIe configuration registers for the tunneling link is always PCI 1x 2.5 GT/s. > + * When only tunneled devices are present, the bandwidth returned is the > + * bandwidth available from the first tunneled device. > */ > u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev, > enum pci_bus_speed *speed, > enum pcie_link_width *width) > { > - u16 lnksta; > - enum pci_bus_speed next_speed; > - enum pcie_link_width next_width; > - u32 bw, next_bw; > + struct pci_dev *tdev = NULL; > + u32 bw = 0; > > if (speed) > *speed = PCI_SPEED_UNKNOWN; > if (width) > *width = PCIE_LNK_WIDTH_UNKNOWN; > > - bw = 0; > - > while (dev) { > - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta); > - > - next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; > - next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> > - PCI_EXP_LNKSTA_NLW_SHIFT; > - > - next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed); > - > - /* Check if current device limits the total bandwidth */ > - if (!bw || next_bw <= bw) { > - bw = next_bw; > - > - if (limiting_dev) > - *limiting_dev = dev; > - if (speed) > - *speed = next_speed; > - if (width) > - *width = next_width; > + if (dev->is_tunneled) { > + if (!tdev) > + tdev = dev; > + goto skip; > } > - > + bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, width); > +skip: > dev = pci_upstream_bridge(dev); > } > > + /* If nothing "faster" found on link, limit to first tunneled device */ > + if (tdev && !bw) > + bw = pcie_calc_bw_limits(tdev, bw, limiting_dev, speed, width); > + > return bw; > } > EXPORT_SYMBOL(pcie_bandwidth_available); > This patch should be split into two, where one just moves the code to the new function. Also note that this will conflict with the FIELD_GET() changes (try to not reintroduce non-FIELD_GET() code when you rebase this on top of v6.7-rc1 :-)). -- i.