On Fri, Oct 18, 2024 at 05:47:46PM +0300, Ilpo Järvinen wrote:
> Hi all,
>
> This series adds PCIe bandwidth controller (bwctrl) and associated PCIe
> cooling driver to the thermal core side for limiting PCIe Link Speed
> due to thermal reasons. PCIe bandwidth controller is a PCI express bus
> port service driver. A cooling device is created for each port the
> service driver finds to support changing speeds.
>
> This series only adds support for controlling PCIe Link Speed.
> Controlling PCIe Link Width might also be useful but there is no
> mechanism for that until PCIe 6.0 (L0p) so Link Width throttling is not
> added by this series.
>
>
> v9:
> - Split RMW ops doc reformat into own patch before adding LNKCTL2.
> - Comment reserved 0 LSB even better than it already was.
> - Consider portdrv future plans:
>     - Use devm helpers for mem alloc, IRQ, and mutex init.
>     - Don't use get/set_service_data().
> - Split rwsem into two to avoid recursive locking splat through
>   pcie_retrain_link().
> - Small wording improvements to commit messages (from Jonathan)
>
> v8:
> - Removed CONFIG_PCIE_BWCTRL (use CONFIG_PCIEPORTBUS)
> - Removed locking wrappers that dealt with the CONFIG variations
> - Protect macro parameter with parenthesis to be on the safe side
>
> v7:
> - Rebased on top of Maciej's latest Target Speed quirk patches
> - Target Speed quirk runs very early, w/o ->subordinate existing yet.
>   This required adapting logic:
>     - Move Supported Link Speeds back to pci_dev
>     - Check for ->subordinate == NULL where necessary
>     - Cannot always take bwctrl's per port mutex (in pcie_bwctrl_data)
> - Cleaned up locking in pcie_set_target_speed() using wrappers
> - Allowed removing confusing __pcie_set_target_speed()
> - Fix building with CONFIG_PCI=n
> - Correct error check in pcie_lbms_seen()
> - Don't return error for an empty bus that remains at 2.5GT
> - Use rwsem to protect ->link_bwctrl setup and bwnotif enable
> - Clear LBMS in remove_board()
> - Adding export for pcie_get_supported_speeds() was unnecessary
> - Call bwctrl's init before hotplug.
> - Added local variable 'bus' into a few functions
>
> v6:
> - Removed unnecessary PCI_EXP_LNKCAP_SLS mask from PCIE_LNKCAP_SLS2SPEED()
> - Split error handling in pcie_bwnotif_irq_thread()
> - pci_info() -> pci_dbg() on bwctrl probe success path
> - Handle cooling device pointer -Exx codes in bwctrl probe
> - Reorder port->link_bwctrl setup / bwnotif enable for symmetry
> - Handle LBMS count == 0 in PCIe quirk by checking LBMS (avoids a race
>   between quirk and bwctrl)
> - Use cleanup.h in PCIe cooling device's register
>
> v5:
> - Removed patches: LNKCTL2 RMW driver patches went in separately
> - Refactor pcie_update_link_speed() to read LNKSTA + add __ variant
>   for hotplug that has LNKSTA value at hand
> - Make series fully compatible with the Target Speed quirk
> - LBMS counter added, quirk falls back to LBMS bit when bwctrl =n
> - Separate LBMS patch from set target speed patches
> - Always provide pcie_bwctrl_change_speed() even if bwctrl =n so drivers
>   don't need to come up their own version (also required by the Target
>   Speed quirk)
> - Remove devm_* (based on Lukas' comment on some other service
>   driver patch)
> - Convert to use cleanup.h
> - Renamed functions/struct to have shorter names
>
> v4:
> - Merge Port's and Endpoint's Supported Link Speeds Vectors into
>   supported_speeds in the struct pci_bus
> - Reuse pcie_get_speed_cap()'s code for pcie_get_supported_speeds()
> - Setup supported_speeds with PCI_EXP_LNKCAP2_SLS_2_5GB when no
>   Endpoint exists
> - Squash revert + add bwctrl patches into one
> - Change to use threaded IRQ + IRQF_ONESHOT
> - Enable also LABIE / LABS
> - Convert Link Speed selection to use bit logic instead of loop
> - Allocate before requesting IRQ during probe
> - Use devm_*()
> - Use u8 for speed_conv array instead of u16
> - Removed READ_ONCE()
> - Improve changelogs, comments, and Kconfig
> - Name functions slightly more consistently
> - Use bullet list for RMW protected registers in docs
>
> v3:
> - Correct hfi1 shortlog prefix
> - Improve error prints in hfi1
> - Add L: linux-pci to the MAINTAINERS entry
>
> v2:
> - Adds LNKCTL2 to RMW safe list in Documentation/PCI/pciebus-howto.rst
> - Renamed cooling devices from PCIe_Port_* to PCIe_Port_Link_Speed_* in
>   order to plan for possibility of adding Link Width cooling devices
>   later on
> - Moved struct thermal_cooling_device declaration to the correct patch
> - Small tweaks to Kconfig texts
> - Series rebased to resolve conflict (in the selftest list)
>
> Ilpo Järvinen (9):
>   Documentation PCI: Reformat RMW ops documentation
>   PCI: Protect Link Control 2 Register with RMW locking
>   PCI: Store all PCIe Supported Link Speeds
>   PCI: Refactor pcie_update_link_speed()
>   PCI/quirks: Abstract LBMS seen check into own function
>   PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
>   PCI/bwctrl: Add API to set PCIe Link Speed
>   thermal: Add PCIe cooling driver
>   selftests/pcie_bwctrl: Create selftests
>
>  Documentation/PCI/pciebus-howto.rst          |  14 +-
>  MAINTAINERS                                  |   9 +
>  drivers/pci/hotplug/pciehp_ctrl.c            |   5 +
>  drivers/pci/hotplug/pciehp_hpc.c             |   2 +-
>  drivers/pci/pci.c                            |  62 ++-
>  drivers/pci/pci.h                            |  38 +-
>  drivers/pci/pcie/Makefile                    |   2 +-
>  drivers/pci/pcie/bwctrl.c                    | 366 ++++++++++++++++++
>  drivers/pci/pcie/portdrv.c                   |   9 +-
>  drivers/pci/pcie/portdrv.h                   |   6 +-
>  drivers/pci/probe.c                          |  15 +-
>  drivers/pci/quirks.c                         |  32 +-
>  drivers/thermal/Kconfig                      |   9 +
>  drivers/thermal/Makefile                     |   2 +
>  drivers/thermal/pcie_cooling.c               |  80 ++++
>  include/linux/pci-bwctrl.h                   |  28 ++
>  include/linux/pci.h                          |  24 +-
>  include/uapi/linux/pci_regs.h                |   1 +
>  tools/testing/selftests/Makefile             |   1 +
>  tools/testing/selftests/pcie_bwctrl/Makefile |   2 +
>  .../pcie_bwctrl/set_pcie_cooling_state.sh    | 122 ++++++
>  .../selftests/pcie_bwctrl/set_pcie_speed.sh  |  67 ++++
>  22 files changed, 843 insertions(+), 53 deletions(-)
>  create mode 100644 drivers/pci/pcie/bwctrl.c
>  create mode 100644 drivers/thermal/pcie_cooling.c
>  create mode 100644 include/linux/pci-bwctrl.h
>  create mode 100644 tools/testing/selftests/pcie_bwctrl/Makefile
>  create mode 100755 tools/testing/selftests/pcie_bwctrl/set_pcie_cooling_state.sh
>  create mode 100755 tools/testing/selftests/pcie_bwctrl/set_pcie_speed.sh

Applied to pci/bwctrl for v6.13, thanks Ilpo (and Alexandru, for the
bandwidth notification interrupt support)!