On Thu, Oct 10, 2024 at 07:10:34PM +0200, Lukas Wunner wrote: > Dennis reports a boot crash on recent Lenovo laptops with a USB4 dock. > > Since commit 0fc70886569c ("thunderbolt: Reset USB4 v2 host router") and > commit 59a54c5f3dbd ("thunderbolt: Reset topology created by the boot > firmware"), USB4 v2 and v1 Host Routers are reset on probe of the > thunderbolt driver. > > The reset clears the Presence Detect State and Data Link Layer Link Active > bits at the USB4 Host Router's Root Port and thus causes hot removal of > the dock. > > The crash occurs when pciehp is unbound from one of the dock's Downstream > Ports: pciehp creates a pci_slot on bind and destroys it on unbind. The > pci_slot contains a pointer to the pci_bus below the Downstream Port, but > a reference on that pci_bus is never acquired. The pci_bus is destroyed > before the pci_slot, so a use-after-free ensues when pci_slot_release() > accesses slot->bus. > > In principle this should not happen because pci_stop_bus_device() unbinds > pciehp (and therefore destroys the pci_slot) before the pci_bus is > destroyed by pci_remove_bus_device(). > > However the stacktrace provided by Dennis shows that pciehp is unbound > from pci_remove_bus_device() instead of pci_stop_bus_device(). > To understand the significance of this, one needs to know that the PCI > core uses a two step process to remove a portion of the hierarchy: It > first unbinds all drivers in the sub-hierarchy in pci_stop_bus_device() > and then actually removes the devices in pci_remove_bus_device(). > There is no precaution to prevent driver binding in-between > pci_stop_bus_device() and pci_remove_bus_device(). > > In Dennis' case, it seems removal of the hierarchy by pciehp races with > driver binding by pci_bus_add_devices(). pciehp is bound to the > Downstream Port after pci_stop_bus_device() has run, so it is unbound by > pci_remove_bus_device() instead of pci_stop_bus_device(). Because the > pci_bus has already been destroyed at that point, accesses to it result in > a use-after-free. > > One might conclude that driver binding needs to be prevented after > pci_stop_bus_device() has run. However it seems risky that pci_slot > points to pci_bus without holding a reference. Solely relying on correct > ordering of driver unbind versus pci_bus destruction is certainly not > defensive programming. > > If pci_slot has a need to access data in pci_bus, it ought to acquire a > reference. Amend pci_create_slot() accordingly. Dennis reports that the > crash is not reproducible with this change. > > Abridged stacktrace: > > pcieport 0000:00:07.0: PME: Signaling with IRQ 156 > pcieport 0000:00:07.0: pciehp: Slot #12 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+ IbPresDis- LLActRep+ > pci_bus 0000:20: dev 00, created physical slot 12 > pcieport 0000:00:07.0: pciehp: Slot(12): Card not present > ... > pcieport 0000:21:02.0: pciehp: pcie_disable_notification: SLOTCTRL d8 write cmd 0 > Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP NOPTI > CPU: 13 UID: 0 PID: 134 Comm: irq/156-pciehp Not tainted 6.11.0-devel+ #1 > RIP: 0010:dev_driver_string+0x12/0x40 > pci_destroy_slot > pciehp_remove > pcie_port_remove_service > device_release_driver_internal > bus_remove_device > device_del > device_unregister > remove_iter > device_for_each_child > pcie_portdrv_remove > pci_device_remove > device_release_driver_internal > bus_remove_device > device_del > pci_remove_bus_device (recursive invocation) > pci_remove_bus_device > pciehp_unconfigure_device > pciehp_disable_slot > pciehp_handle_presence_or_link_change > pciehp_ist > > Reported-by: Dennis Wassenberg <Dennis.Wassenberg@xxxxxxxxxxx> > Tested-by: Dennis Wassenberg <Dennis.Wassenberg@xxxxxxxxxxx> > Closes: https://lore.kernel.org/r/6de4b45ff2b32dd91a805ec02ec8ec73ef411bf6.camel@xxxxxxxxxxx/ > Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx Applied with Mika's reviewed-by to pci/hotplug for v6.13, thanks! > --- > I am tempted to remove the call to device_release_driver() from > pci_stop_dev() and just rely on driver unbinding by device_del(). > It would simplify and rationalize the code. The call was introduced by > commit c4a0a5d964e9 (PCI: Move device_del() from pci_stop_dev() to > pci_destroy_dev()) without providing an explicit reason. > > Dennis stress-tested driver unbinding via device_del() without witnessing > any problems. The only downside I see is that it would re-introduce the > cosmetic issue avoided by commit 16b6c8bb687c ("PCI: Detach driver before > procfs & sysfs teardown on device remove"). > > Preventing driver binding after pci_stop_bus_device() should be achieved > by this one-line patch, though that's still racy as pci_bus_add_devices() > might revert the match_driver flag to true after pci_stop_bus_device() has > set it to false: > https://lore.kernel.org/r/Zv-dIHDXNNYomG2Y@xxxxxxxxx/ > > An alternative would be to serialize removal of the hierarchy with > pci_bus_add_devices() by way of pci_lock_rescan_remove(): > https://lore.kernel.org/r/20241003084342.27501-1-brgl@xxxxxxxx/ > > Both approaches are yet to be tested by Dennis. Personally I would like > to avoid the pci_lock_rescan_remove() approach. We should try to move > away from this big lock and use finer grained locking instead. So again, > just dropping the call to device_release_driver() would be the simplest > and most preferred approach from my point of view. Thoughts? > > drivers/pci/slot.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c > index 0f87cad..ed645c7 100644 > --- a/drivers/pci/slot.c > +++ b/drivers/pci/slot.c > @@ -79,6 +79,7 @@ static void pci_slot_release(struct kobject *kobj) > up_read(&pci_bus_sem); > > list_del(&slot->list); > + pci_bus_put(slot->bus); > > kfree(slot); > } > @@ -261,7 +262,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, > goto err; > } > > - slot->bus = parent; > + slot->bus = pci_bus_get(parent); > slot->number = slot_nr; > > slot->kobj.kset = pci_slots_kset; > @@ -269,6 +270,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr, > slot_name = make_slot_name(name); > if (!slot_name) { > err = -ENOMEM; > + pci_bus_put(slot->bus); > kfree(slot); > goto err; > } > -- > 2.43.0 >