From: Stanislav Spassov <stanspas@xxxxxxxxx>

The first version of this patch series can be found here:
https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@xxxxxxxxxx

Originally (v1), this patch series aimed only to solve an issue where
pci_dev_wait can cause system crashes. After a reset, a hung device may
keep responding with CRS completions indefinitely. If CRS Software
Visibility is enabled on the Root Port, attempting to read any register
other than PCI_VENDOR_ID will cause the Root Port to autonomously retry
the request without reporting back to the CPU core. Unless the number of
retries or the amount of time spent retrying is limited by
platform-specific means, this scenario leads to low-level platform
timeouts (such as a TOR Timeout), which easily escalate to a crash.

The feedback on the first version of this patch series inspired a deeper
dive into the PCI Firmware Spec (_DSM functions 8 and 9), which revealed
several different types of delays that can be overridden on a per-device
basis to avoid waiting too long on devices that are known to come back
quickly after reset. The kernel already stores such overrides for some,
but not all, of the delays.

While adding the infrastructure to allow overriding delays, I discovered
and addressed several inconsistencies between what the PCIe Base
Specification says and what the code does, and came up with more
improvements around device resets and readiness polling.

This patch series now paves the way for Readiness Time Reporting
capability support, and touches upon (in comments) some changes that
would be required for supporting Readiness Notifications.

[Compared to v2, v3 fixes build failures on i386 and arm/arm64:
Reported-by: kbuild test robot <lkp@xxxxxxxxx>
- int(value_us / 1000) does not work for u64 value_us due to:
    undefined reference to `__udivdi3'
  Change: use '(int)value_us / 1000' to match pre-existing code.
  It seems this would be susceptible to overflow/truncation?
- I had failed to replace all mentions of PCI_PM_D3COLD_WAIT after
  renaming that constant to PCI_RESET_DELAY.]

Stanislav Spassov (17):
  PCI: Fall back to slot/bus reset if softer methods timeout
  PCI: Remove unused PCI_PM_BUS_WAIT
  PCI: Use pci_bridge_wait_for_secondary_bus after SBR
  PCI: Do not override delay for D0->D3hot transition
  PCI: Fix handling of _DSM 8 (avoiding reset delays)
  PCI: Fix us->ms conversion in pci_acpi_optimize_delay
  PCI: Clean up and document PM/reset delays
  PCI: Add more delay overrides to struct pci_dev
  PCI: Generalize pci_bus_max_d3cold_delay to pci_bus_max_delay
  PCI: Use correct delay in pci_bridge_wait_for_secondary_bus
  PCI: Refactor pci_dev_wait to remove timeout parameter
  PCI: Refactor pci_dev_wait to take pci_init_event
  PCI: Cache CRS Software Visibiliy in struct pci_dev
  PCI: Introduce per-device reset_ready_poll override
  PCI: Refactor polling loop out of pci_dev_wait
  PCI: Add CRS handling to pci_dev_wait()
  PCI: Lower PCIE_RESET_READY_POLL_MS from 1m to 1s

 Documentation/power/pci.rst           |   4 +-
 arch/x86/pci/intel_mid_pci.c          |   2 +-
 drivers/hid/intel-ish-hid/ipc/ipc.c   |   2 +-
 drivers/mfd/intel-lpss-pci.c          |   2 +-
 drivers/net/ethernet/marvell/sky2.c   |   2 +-
 drivers/pci/controller/pci-aardvark.c |   2 +-
 drivers/pci/controller/pci-mvebu.c    |   2 +-
 drivers/pci/iov.c                     |   4 +-
 drivers/pci/pci-acpi.c                | 106 ++++++++----
 drivers/pci/pci-driver.c              |   4 +-
 drivers/pci/pci.c                     | 233 ++++++++++++++++++--------
 drivers/pci/pci.h                     |  81 ++++++++-
 drivers/pci/probe.c                   |  10 +-
 drivers/pci/quirks.c                  |   9 +-
 include/linux/pci-acpi.h              |   8 +-
 include/linux/pci.h                   |  45 ++++-
 16 files changed, 390 insertions(+), 126 deletions(-)

base-commit: bb6d3fb354c5ee8d6bde2d576eb7220ea09862b9
-- 
2.25.1

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879