On 07/17/2015 02:16 AM, Yijing Wang wrote:
Rajat Jain reported a deadlock when a hierarchical hot plug thread and
aer recovery thread both run.
https://lkml.org/lkml/2015/3/11/861

thread 1:
pciehp_enable_slot()
 pciehp_configure_device()
  pci_bus_add_devices()
   device_attach(dev)
    device_lock(dev) //acquire device mutex successfully
    ...
     pciehp_probe(dev)
      __pci_hp_register()
       pci_create_slot()
        down_write(pci_bus_sem) //deadlock here

thread 2:
aer_isr_one_error()
 aer_process_err_device()
  do_recovery()
   broadcast_error_message()
    pci_walk_bus()
     down_read(&pci_bus_sem) //acquire pci_bus_sem successfully
      report_error_detected(dev)
       device_lock(dev) // deadlock here

We use down_write(&pci_bus_sem) to protect the bus->slots list,
because the bus->slots list is only accessed in drivers/pci/slot.c,
we could introduce a new local mutex to protect bus->slots, and use
down_read(&pci_bus_sem) instead of down_write(&pci_bus_sem) to protect
the bus->devices list.

Signed-off-by: Yijing Wang <wangyijing@xxxxxxxxxx>
I applied both patches to our system and ran a number of tests.
Works fine as far as I can see.

Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>