On Mon, Jul 17, 2023 at 03:04:52PM +0300, Ilpo Järvinen wrote: > PCI Express Capability RMW accessors don't properly protect against > concurrent access. Link Control Register is written by a number of > things in the kernel in a RMW fashion without any concurrency control. > This could in the unlucky case lead to losing one of the updates. One > of the most obvious path which can race with most of the other LNKCTL > RMW operations seems to be ASPM policy sysfs write which triggers > LNKCTL update. Similarly, Root Control Register can be concurrently > accessed by AER and PME. > > Make pcie_capability_clear_and_set_word() (and other RMW accessors that > call it) to use a per device spinlock to protect the RMW operations to > the Capability Registers that require locking. Convert open-coded > LNKCTL RMW operations to use pcie_capability_clear_and_set_word() to > benefit from the locking. > > There's also a related series which improves ASPM service driver and > device driver coordination by removing out-of-band ASPM state > management from device drivers (which will remove some of the code > fragments changed by this series but it has higher regression > potential which is why it seems prudent to do these changes in two > steps): > https://lore.kernel.org/linux-pci/20230602114751.19671-1-ilpo.jarvinen@xxxxxxxxxxxxxxx/T/#t > > v5: > - Remove reversed logic from a conditional > - Use a variable for CCC setup > > v4: > - Rebased on top of pci/main > - Added patch to update documentation > > v3: > - Split link retraining change off from ASPM patch & reorder it earlier > - Adjust changelog to take into account the move of link retraining > code into PCI core and no longer refer to ASPM (currently in > pci/enumeration branch) > - based on top of pci/main > > v2: > - Keep the RMW ops caller API the same > - Make pcie_capability_clear_and_set_word() a wrapper that uses > locked/unlocked variant based on the capability reg > - Extracted LNKCTL2 changes out from this series to keep this purely > a series which fixes something (LNKCTL2 RMW lock is necessary only > when PCIe BW control is introduced). > - Added Fixes tags (it's a bit rathole but yeah, they're there now). > - Renamed cap_lock to pcie_cap_lock > - Changed ath1* to clear the ASPMC field before setting it > > Ilpo Järvinen (11): > PCI: Add locking to RMW PCI Express Capability Register accessors > PCI: Make link retraining use RMW accessors for changing LNKCTL > PCI: pciehp: Use RMW accessors for changing LNKCTL > PCI/ASPM: Use RMW accessors for changing LNKCTL > drm/amdgpu: Use RMW accessors for changing LNKCTL > drm/radeon: Use RMW accessors for changing LNKCTL > net/mlx5: Use RMW accessors for changing LNKCTL > wifi: ath11k: Use RMW accessors for changing LNKCTL > wifi: ath12k: Use RMW accessors for changing LNKCTL > wifi: ath10k: Use RMW accessors for changing LNKCTL > PCI: Document the Capability accessor RMW improvements > > Documentation/PCI/pciebus-howto.rst | 14 ++++--- > drivers/gpu/drm/amd/amdgpu/cik.c | 36 +++++------------- > drivers/gpu/drm/amd/amdgpu/si.c | 36 +++++------------- > drivers/gpu/drm/radeon/cik.c | 36 +++++------------- > drivers/gpu/drm/radeon/si.c | 37 +++++-------------- > .../ethernet/mellanox/mlx5/core/fw_reset.c | 9 +---- > drivers/net/wireless/ath/ath10k/pci.c | 9 +++-- > drivers/net/wireless/ath/ath11k/pci.c | 10 +++-- > drivers/net/wireless/ath/ath12k/pci.c | 10 +++-- > drivers/pci/access.c | 20 ++++++++-- > drivers/pci/hotplug/pciehp_hpc.c | 12 ++---- > drivers/pci/pci.c | 8 +--- > drivers/pci/pcie/aspm.c | 30 +++++++-------- > drivers/pci/probe.c | 1 + > include/linux/pci.h | 34 ++++++++++++++++- > 15 files changed, 136 insertions(+), 166 deletions(-) Applied to pci/pcie-rmw for v6.6, thanks! I removed the stable tags because we don't know of any actual problems these fix. Bjorn