On Wed, Jun 12, 2024 at 11:16:25AM -0700, Keith Busch wrote:
> DPC and AER handling access their subordinate bus devices. If pciehp
> should happen to also trigger during this handling, it will remove
> all the subordinate buses, then dereferencing any children may be a
> use-after-free. That may lead to kernel panics like the below.

I assume the crash occurs because the struct pci_bus accessed by
pci_bus_read_config_dword() has already been freed?

Generally the solution to issues like this is to hold references
on the structs being accessed, not to acquire locks that happen to
prevent the structs from being freed.

Question is, which struct ref needs to be held and where?

Holding a ref on a struct pci_dev also holds the pci_bus it resides on
in place. So I suspect we need to call pci_dev_get() somewhere.

The stack trace looks incomplete for some reason:

>  ? pci_bus_read_config_dword+0x17/0x50
>  pci_dev_wait+0x107/0x190
>  ? dpc_completed+0x50/0x50
>  dpc_reset_link+0x4e/0xd0
>  pcie_do_recovery+0xb2/0x2d0

I'd expect a call to pci_bridge_wait_for_secondary_bus() from
dpc_reset_link(), which in turn calls pci_dev_wait().

Indeed pci_bridge_wait_for_secondary_bus() does something fishy:
It takes the first entry from the devices list without acquiring a ref:

	child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
				 bus_list);

Below is a small patch which acquires a ref on child. Maybe this
already does the trick?

-- >8 --

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2a8063e..82db9a8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
  */
 int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 {
-	struct pci_dev *child;
+	struct pci_dev *child __free(pci_dev_put) = NULL;
 	int delay;
 
 	if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,9 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 		return 0;
 	}
 
-	child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
-				 bus_list);
+
+	child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+					     struct pci_dev, bus_list));
 	up_read(&pci_bus_sem);
 
 	/*