[PATCH v3 0/4] PCIe hotplug interrupt related fixes

Hi all,

This patchset addresses two PCIe hotplug interrupt related problems that
we ran into recently:

1. Firmware developers reported that, on an ARM server, they received two
   PCIe hotplug commands within a very short interval. This doesn't comply
   with the PCIe spec, and it broke their state machine and workflow.
2. An IRQ storm was found when testing the "pci=nomsi" case. The root
   cause: 'nomsi' disables MSI, so devices and root ports fall back to
   legacy INTx interrupts, and several devices/ports are likely to share
   one interrupt. In the failing case, the BIOS doesn't disable the PCIe
   hotplug interrupts, and the command-complete interrupt is actually
   asserted.

More details can be found in the commit logs of patches 2/4 and 4/4. In short:
    Patch 0001 moves the PCIe hotplug command waiting function from the
               pciehp driver to the PCIe port driver for code reuse.
    Patch 0002 adds the necessary wait for PCIe hotplug commands.
    Patch 0003 loosens the condition check for interrupt disabling.
    Patch 0004 disables the PCIe hotplug interrupts in the early boot
               phase when MSI is disabled.
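
For readers unfamiliar with the waiting logic, the following is a rough
userspace model of it, not the code in the patches: poll Slot Status for the
Command Completed bit with a bounded number of attempts, and skip the wait
entirely on hardware that advertises No Command Completed Support. The
register bit values match include/uapi/linux/pci_regs.h. struct fake_port,
fake_read_sltsta() and wait_for_cmd_complete() are hypothetical names used
only here, and the poll counter stands in for a real timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTCAP_NCCS 0x00040000u /* No Command Completed Support */
#define PCI_EXP_SLTSTA_CC   0x0010u     /* Command Completed */

/* Hypothetical stand-in for a hotplug-capable port; not a kernel struct. */
struct fake_port {
	uint32_t slot_cap;		/* Slot Capabilities */
	uint16_t slot_status;		/* Slot Status */
	int polls_until_done;		/* simulated reads before CC gets set */
};

/* Simulated Slot Status read: CC appears after polls_until_done reads. */
static uint16_t fake_read_sltsta(struct fake_port *p)
{
	if (p->polls_until_done > 0 && --p->polls_until_done == 0)
		p->slot_status |= PCI_EXP_SLTSTA_CC;
	return p->slot_status;
}

/*
 * Wait for the previous hotplug command to complete.
 * Returns 0 if the command completed or no wait is needed, -1 on timeout.
 */
static int wait_for_cmd_complete(struct fake_port *p, int max_polls)
{
	assert(p != NULL);

	/* Hardware without command-complete support: nothing to wait for. */
	if (p->slot_cap & PCI_EXP_SLTCAP_NCCS)
		return 0;

	for (int i = 0; i < max_polls; i++) {
		if (fake_read_sltsta(p) & PCI_EXP_SLTSTA_CC) {
			/* Model clearing the bit (write-1-to-clear on real HW). */
			p->slot_status &= (uint16_t)~PCI_EXP_SLTSTA_CC;
			return 0;
		}
		/* A real implementation would sleep between polls here. */
	}
	return -1;
}
```

In the kernel, the loop would more naturally be expressed with
read_poll_timeout(), as noted in the changelog.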

Please help to review, thanks!

- Feng

Changelog:

  since v2:
    * Add patch 0001, which moves the waiting logic of pcie_poll_cmd from the
      pciehp driver to the PCIe port driver for code reuse (Bjorn Helgaas)
    * Separate Lukas' suggestion out as patch 0003 (Bjorn and Sathyanarayanan)
    * Avoid hotplug command waiting for HW without command-complete
      event support (Bjorn Helgaas)
    * Fix spelling issues in commit log (Bjorn and Markus)
    * Add cover-letter for whole patchset (Markus Elfring)
    * Handle a set-but-unused build warning (0Day lkp bot)

  since v1:
    * Add an Originally-by tag for Liguang to patch 0002. The issue was found
      on a 5.10 kernel, then on 6.6. I was initially given a 5.10 kernel
      tarball without git info to debug the issue, and made the patch. Thanks
      to Guanghui, who recently pointed me to the tree
      https://gitee.com/anolis/cloud-kernel which shows that the wait logic
      in 5.10 was originally from Liguang and never hit mainline.
    * Make the irq disabling not dependent on whether the pciehp service
      driver will be loaded (Lukas Wunner)
    * Use read_poll_timeout() API to simplify the waiting logic
      (Sathyanarayanan Kuppuswamy)
    * Fix wrong email address (Markus Elfring)
    * Add logic to skip irq disabling if it is already disabled.


Feng Tang (4):
  PCI: portdrv: pciehp: Move PCIe hotplug command waiting logic to port
    driver
  PCI/portdrv: Add necessary wait for disabling hotplug events
  PCI/portdrv: Loose the condition check for disabling hotplug
    interrupts
  PCI: Disable PCIe hotplug interrupts early when msi is disabled

 drivers/pci/hotplug/pciehp_hpc.c | 38 ++++++------------------
 drivers/pci/pci.h                |  7 +++++
 drivers/pci/pcie/portdrv.c       | 50 ++++++++++++++++++++++++++++----
 drivers/pci/probe.c              |  9 ++++++
 4 files changed, 70 insertions(+), 34 deletions(-)

-- 
2.43.5




