PCIe ASPM crash on device removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stratus has encountered crashes in 3.8.0-rc2 ASPM tracking code when 
hotplugging PCI buses.  The call trace of a crash usually looks like:

pci_stop_and_remove_bus_device
  pci_stop_bus_device
    pcie_aspm_exit_link_state
      pcie_update_aspm_capable
        pcie_aspm_check_latency

where ASPM is trying to access a freed data structure.  When instrumenting
the ASPM init/exit code, it seems that pcie_aspm_exit_link_state is 
skipping over one of our PCIe Downstream Ports that 
pcie_aspm_init_link_state had prior allocated link_state.  Subsequent 
trips through ASPM exit code traverse through the hierarchy and crash 
trying to access freed memory (slub_debug was set to FZPU and  
pcie_aspm_check_latency is trying to access a link* that is set to 
0x6b6b6b6b6b6b6b6b.) 

A brief summary of the init/exit link_state routines is as follows.

  pcie_aspm_init_link_state will alloc_pcie_link_state if:
  
  1 - Device is PCIe
  2 - Hasn't already allocated a link_state
  3 - Device is PCI_EXP_TYPE_ROOT_PORT or PCI_EXP_TYPE_DOWNSTREAM
  4 - Device is not VIA chipset (root port under bridge)
  5 - Device's subordinate device list is not empty


  pcie_aspm_exit_link_state will free_link_state of *parent* device if:
  
  1 - Device is PCIe
  2 - Device has a parent device
  3 - Parent device has a link_state
  4 - Parent device is PCI_EXP_TYPE_ROOT_PORT or PCI_EXP_TYPE_DOWNSTREAM
  5 - Device is the last entry in parent's subordinate device list


The Stratus PCI topology includes a branch that looks like: 

... -00.0-[03-3c]----00.0-[04-2d]--+- ...
                                   |
                                   \-01.0-[2c-2d]--+-00.0
                                                   +-00.1
                                                   \-1f.0

% lspci -vs 03:00.0
03:00.0 PCI bridge: Device 1bcf:0004 (rev 01) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Memory at 93000000 (64-bit, non-prefetchable) [size=4K]
        Bus: primary=03, secondary=04, subordinate=2d, sec-latency=0
        I/O behind bridge: 00001000-00005fff
        Memory behind bridge: 84000000-92ffffff
        Prefetchable memory behind bridge: 00000000c4000000-00000000c7ffffff
        Capabilities: [b0] Express Upstream Port, MSI 00
        Capabilities: [ec] Power Management version 2
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: pcieport

% lspci -vs 04:01.0
04:01.0 PCI bridge: Device 1bcf:0009 (rev 01) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=04, secondary=2c, subordinate=2d, sec-latency=0
        I/O behind bridge: 00005000-00005fff
        Memory behind bridge: 90000000-92ffffff
        Capabilities: [b0] Express Downstream Port (Slot-), MSI 00
        Capabilities: [ec] Power Management version 2
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: pcieport

% lspci -vs 2c:00.0
2c:00.0 USB Controller: Intel Corporation Patsburg USB2 Enhanced Host 
                        Controller #1 (rev 06) (prog-if 20 [EHCI])
        Subsystem: Device 1bcf:8002
        Flags: bus master, medium devsel, latency 0, IRQ 10
        Memory at 90000000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCI Advanced Features
        Kernel driver in use: ehci-pci

% lspci -vs 2c:00.1
2c:00.1 USB Controller: Intel Corporation Patsburg USB2 Enhanced Host
                        Controller #2 (rev 06) (prog-if 20 [EHCI])
        Subsystem: Device 1bcf:8002
        Flags: bus master, medium devsel, latency 0, IRQ 11
        Memory at 90001000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCI Advanced Features
        Kernel driver in use: ehci-pci

% lspci -vs 2c:1f.0
2c:1f.0 ISA bridge: Intel Corporation Patsburg LPC Controller (rev 06)
        Subsystem: Device 1bcf:8002
        Flags: bus master, medium devsel, latency 0
        Capabilities: [e0] Vendor Specific Information: Len=0c <?>

When debugging is added to pcie_init_link_state, 04:01.0 does set 
blacklist to 1, as the pcie_aspm_sanity_check figures out that at least 
one of its children is not PCIe.  However, all of the criteria 1-5 above 
is met and a link_state is allocated for this device.

On device removal, pcie_aspm_exit_link_state will never clean up 04:01.0 as
2c:00.0, 2c:00.1, and most importantly 2c:1f.0 are not PCIe (criteria 1).


In reply to this mail, I will post two patches I have been testing this 
week:

[PATCH 1/2] PCI: ASPM exit link state code could skip devices

In looking at pcie_aspm_exit_link_state, might it be better to verify that
the *parent* device is PCIe rather than the device itself?  This would
prevent early exit when the last function is not PCIe and also verify that
the parent is PCIe before calling pci_pcie_type on it.

Furthermore, I didn't investigate this, but does the strategy of the 
last device function out cleaning up the parent link_state leave another
whole -- what if there are no child devices in the first place?

I'd be happy to test other approaches here.

[PATCH 2/2] PCI: Don't touch ASPM if forcibly disabled

Matthew Garrett had provided us this patch to use against a RHEL 6.3 
kernel.  It prevents ASPM from allocating any link_state when 
'pcie_aspm=off' is specified on the kernel command line.  Stratus has been 
using this patch to avoid this crash in RHEL 6.3.

When testing with Patch 1, this second patch was unnecessary (I removed 
any pcie_aspm directives from the boot line).  However given its utility 
in avoiding the crash in the first place, it would be nice to offer the 
user a means to turn off ASPM tracking to avoid other potential issues.

Regards,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux