About change 456d8aa37d0f "PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") Let's say root port 00:03.0 was connected to a PCIe switch. [00]-+.. | +--03.0-[01-03]--+-00.0-[02-03]--+-02.0-[02]--x | | \-0d.0-[03]----00.0 Endpoint . +-00.1 Upstream Port Sibling And that switch had not only an upstream port at 01:00.0, but also a sibling function at 01:00.1. Let's break the link under 00:03.0, which makes pciehp remove the [01] bus. Surprise effect: traversal during bus [01] device removal happens in reverse order (for SR-IOV-ish reasons, see pciehp_pci.c commentary). Fair enough, ASPM should probably not rely on any specific order anyway. Recursing through pci_remove_bus_device() underneath, the order in which we pci_destroy_dev() will be: [ 01:00.1 [ 02:02.0 [ 03:00.0 ] 02:0d.0 ] 01:00.0 ] Trivially, the above is also the order in which pcie_aspm_exit_link_state() will be called. Then note how, since above change removed the list_empty() exit condition, we are now going to remove the pcie_link_state for bus [01] (parent=<00:03.0>) during the first invocation, i.e. right at pcie_aspm_exit_link_state(<01:00.1>). Iow: with bus [03] removal only to come, we removed the pcie_link_state upstream first, and only then will remove the downstream pcie_link_state at parent=<02:0d.0>. Eventually reaching that second link, it carries a ref "parent_link = link->parent" which now points to free'd memory again. One can observe a rather high probability of finishing with a random GPF or nullptr dereference condition. Above switches, with MFD upstream portions, exist. Case at hand is a PEX8717 with 4 DMA engines: +-08.0-[51-5b]--+-00.0-[52-5b]--+-02.0-[53]-- | \-0d.0-[54-5b]----00.0 Broadcom Inc. … +-00.1 PLX Technology, Inc. PEX PCI Express Switch DMA interface +-00.2 PLX Technology, Inc. PEX PCI Express Switch DMA interface +-00.3 PLX Technology, Inc. PEX PCI Express Switch DMA interface \-00.4 PLX Technology, Inc. PEX PCI Express Switch DMA interface Backtrace: [ 790.817077] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 790.900514] #PF: supervisor read access in kernel mode [ 790.962081] #PF: error_code(0x0000) - not-present page [ 791.023641] PGD 8000000104648067 P4D 8000000104648067 PUD 1404bc067 PMD 0 [ 791.106041] Oops: 0000 [#1] PREEMPT SMP PTI [ 791.156151] CPU: 8 PID: 145 Comm: irq/43-pciehp Tainted: G OE 6.1.0-22-2-amd64 #5 Debian 6.1.94-1 [ 791.279173] Hardware name: Intel Camelback Mountain CRB/Camelback Mountain CRB, BIOS Aboot-norcal7-7.1.6-generic-22971530 06/30/2021 [ 791.421982] RIP: 0010:pcie_config_aspm_link+0x48/0x330 [ 791.483548] Code: 48 8b 04 25 28 00 00 00 48 89 44 24 30 31 c0 8b 47 30 4c 8b 47 08 83 e3 7f c1 e8 0e f7 d3 89 c2 83 e0 7f 21 c3 83 e2 7f 21 f3 <41> 8b b6 a0 00 00 00 89 d8 83 e0 87 f6 c3 04 0f 44 d8 0f b7 47 30 [ 791.708646] RSP: 0018:ffffa8cf8062bcb8 EFLAGS: 00010246 [ 791.771254] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 791.856772] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94bac3d56a80 [ 791.942291] RBP: ffff94bac3d56a80 R08: 0000000000000000 R09: ffffa8cf8062bc6c [ 792.027808] R10: 0000000000000000 R11: 0000000000000004 R12: ffff94b9c0f61fc0 [ 792.113326] R13: ffff94bbc0ae9828 R14: 0000000000000000 R15: ffff94b9c0ea6f20 [ 792.198845] FS: 0000000000000000(0000) GS:ffff94c8ffc00000(0000) … [ 792.295827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 792.364680] CR2: 00000000000000a0 CR3: 00000001062fe003 CR4: 00000000003706e0 [ 792.450198] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 792.535717] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 792.621235] Call Trace: [ 792.650506] <TASK> [ 792.675616] ? __die_body.cold+0x1a/0x1f [ 792.722600] ? page_fault_oops+0xd2/0x2b0 [ 792.770625] ? sysvec_apic_timer_interrupt+0xa/0x90 [ 792.829068] ? exc_page_fault+0x70/0x170 [ 792.876051] ? asm_exc_page_fault+0x22/0x30 [ 792.926161] ? pcie_config_aspm_link+0x48/0x330 [ 792.980437] pcie_aspm_exit_link_state+0xb9/0x120 [ 793.036796] pci_remove_bus_device+0xc8/0x110 [ 793.088988] pci_remove_bus_device+0x2e/0x110 [ 793.141180] pci_remove_bus_device+0x3e/0x110 [ 793.193373] pciehp_unconfigure_device+0x94/0x160 [ 793.249733] pciehp_disable_slot+0x69/0x100 [ 793.299840] pciehp_handle_presence_or_link_change+0x241/0x350 [ 793.369742] pciehp_ist+0x164/0x170 [ 793.411524] ? disable_irq_nosync+0x10/0x10 [ 793.461632] irq_thread_fn+0x1f/0x60 [ 793.504449] irq_thread+0xfa/0x1c0 [ 793.545185] ? irq_thread_fn+0x60/0x60 [ 793.590085] ? irq_thread_check_affinity+0xf0/0xf0 [ 793.647485] kthread+0xda/0x100 [ 793.685096] ? kthread_complete_and_exit+0x20/0x20 [ 793.742495] ret_from_fork+0x22/0x30 [ 793.785314] </TASK> Daniel Stodden (1): PCI/ASPM: fix link state exit during switch upstream function removal. drivers/pci/pcie/aspm.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) -- 2.47.0