Re: [PATCH] PCI: fix kernel oops on bridge removal

Alex Chiang wrote:
* Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>:
Thank you very much for testing.

We still have a similar kernel oops (see below) with the ACPI PCI slot
detection driver. I guess the same problem would also occur with
acpiphp, though I've not tried it yet. I haven't looked at Trent's bus
notifier approach yet, but I think we need something like this to
fix this problem.

Here are the steps to reproduce and the kernel oops message.

* Steps to reproduce

(1) Load ACPI pci slot detection driver
(2) Remove the parent bridge of the slot
(3) Unload ACPI pci slot detection driver

Thanks for the report.

I believe this patch will fix the case for bridges. I haven't
tested what happens if we remove an endpoint yet though.

Can you try this?

Thanks.

/ac

---
commit 557ce38e78cf06ce16aefcf273051ea0bac3d35c
Author: Alex Chiang <achiang@xxxxxx>
Date:   Thu Mar 26 15:36:34 2009 -0600

    PCI: pci_create_slot / pci_destroy_slot need to grab reference to parent bus

    If a logical hotunplug (remove) is performed on a PCI bridge claimed by
    a hotplug or slot detection driver, and then the hotplug/detection module
    is unloaded, we will encounter an oops:

    Call Trace:
     [<a000000100039bc0>] die+0x1c0/0x2c0
                                    sp=e0000005062ff9e0 bsp=e0000005062f13b0
     [<a000000100039d00>] die_if_kernel+0x40/0x60
                                    sp=e0000005062ff9e0 bsp=e0000005062f1380
     [<a00000010003b590>] ia64_fault+0x1230/0x1280
                                    sp=e0000005062ff9e0 bsp=e0000005062f1300
     [<a00000010000c700>] ia64_native_leave_kernel+0x0/0x270
                                    sp=e0000005062ffbf0 bsp=e0000005062f1300
     [<a0000001003988f0>] pci_slot_release+0x70/0x1c0
                                    sp=e0000005062ffdc0 bsp=e0000005062f12b0
     [<a0000001003694d0>] kobject_release+0x4f0/0x5e0
                                    sp=e0000005062ffdc0 bsp=e0000005062f1270
     [<a00000010036b490>] kref_put+0xd0/0x100
                                    sp=e0000005062ffdc0 bsp=e0000005062f1248
     [<a000000100368650>] kobject_put+0x90/0xc0
                                    sp=e0000005062ffdc0 bsp=e0000005062f1220
     [<a000000100399260>] pci_destroy_slot+0xa0/0xe0
                                    sp=e0000005062ffdc0 bsp=e0000005062f11f0
     [<a0000002044d2c70>] pci_hp_deregister+0x510/0x560 [pci_hotplug]
                                    sp=e0000005062ffdc0 bsp=e0000005062f11a8
     [<a000000205d71aa0>] acpiphp_unregister_hotplug_slot+0x80/0x100 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f1180
     [<a000000205d73d40>] cleanup_bridge+0x3a0/0x4c0 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f1128
     [<a000000205d73ee0>] cleanup_p2p_bridge+0x80/0xc0 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f1108
     [<a0000001003ed200>] acpi_ns_walk_namespace+0x160/0x2e0
                                    sp=e0000005062ffdc0 bsp=e0000005062f1098
     [<a0000001003e8850>] acpi_walk_namespace+0x90/0xe0
                                    sp=e0000005062ffdc0 bsp=e0000005062f1048
     [<a000000205d73f70>] remove_bridge+0x50/0xe0 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f1028
     [<a000000100411590>] acpi_pci_unregister_driver+0x1f0/0x2a0
                                    sp=e0000005062ffdc0 bsp=e0000005062f0fe8
     [<a000000205d759d0>] acpiphp_glue_exit+0x30/0x60 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f0fd0
     [<a000000205d77380>] acpiphp_exit+0x20/0x40 [acpiphp]
                                    sp=e0000005062ffdc0 bsp=e0000005062f0fb8
     [<a0000001000e9310>] sys_delete_module+0x410/0x520
                                    sp=e0000005062ffdc0 bsp=e0000005062f0f38

    This is because pci_slot_release will access the parent PCI bus,
    which has already been released by the user's prior hot unplug.

    The solution is for pci_create_slot to grab a reference on the
    parent PCI bus (and pci_destroy_slot to put the reference). This
    will prevent the parent from being released while the hotplug or slot
    detection driver is loaded.

    Reported-by: Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>
    Signed-off-by: Alex Chiang <achiang@xxxxxx>

diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index 2118944..459d6a2 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -248,6 +248,7 @@ placeholder:
 		if (PCI_SLOT(dev->devfn) == slot_nr)
 			dev->slot = slot;

+	get_device(&parent->dev);
 	dev_dbg(&parent->dev, "dev %02x, created physical slot %s\n",
 		slot_nr, pci_slot_name(slot));

@@ -302,6 +303,7 @@ void pci_destroy_slot(struct pci_slot *slot)
 		slot->number, atomic_read(&slot->kobj.kref.refcount) - 1);

 	down_write(&pci_bus_sem);
+	put_device(&slot->bus->dev);
 	kobject_put(&slot->kobj);
 	up_write(&pci_bus_sem);
 }
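
As an aside for readers of the thread: the lifetime rule described in the
commit message above (the slot pins its parent bus at creation and unpins
it at destruction) can be modeled in a few lines of userspace C. This is
only a sketch under stated assumptions: struct bus, struct slot, bus_get(),
bus_put(), slot_create() and slot_destroy() are hypothetical stand-ins, not
the kernel's struct pci_bus, get_device() or put_device(), and a plain int
replaces the kref. The sketch drops the parent reference only after the
slot itself has been torn down; that ordering is a choice made for this
model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted parent bus (not struct pci_bus). */
struct bus {
	int refcount;
	int released;	/* set when the last reference is dropped */
};

/* Hypothetical stand-in for a slot that points back at its parent bus. */
struct slot {
	struct bus *bus;
};

static void bus_get(struct bus *b)
{
	b->refcount++;
}

static void bus_put(struct bus *b)
{
	assert(b->refcount > 0);
	if (--b->refcount == 0)
		b->released = 1;	/* real code would free the object here */
}

static struct slot *slot_create(struct bus *parent)
{
	struct slot *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->bus = parent;
	bus_get(parent);	/* pin the parent for the slot's lifetime */
	return s;
}

static void slot_destroy(struct slot *s)
{
	struct bus *parent = s->bus;

	/* Slot teardown may still touch the parent, so it must be alive. */
	assert(!parent->released);
	free(s);
	bus_put(parent);	/* unpin only after the slot is gone */
}

/*
 * Models the reported sequence: load driver (create slot), hot-unplug the
 * bridge (core drops its reference), unload driver (destroy slot).
 * Returns 1 if the parent was still alive at slot teardown and was
 * released afterwards, 0 if not, -1 on allocation failure.
 */
static int unplug_then_unload(void)
{
	struct bus parent = { .refcount = 1, .released = 0 };
	struct slot *s = slot_create(&parent);
	int alive_at_unload;

	if (!s)
		return -1;
	bus_put(&parent);			/* hot unplug */
	alive_at_unload = !parent.released;	/* still pinned by the slot */
	slot_destroy(s);			/* module unload */
	return alive_at_unload && parent.released;
}
```

Calling unplug_then_unload() from a small main() exercises the three
reproduction steps in order. Without the bus_get() in slot_create(), the
assert in slot_destroy() would fire, which is the model's analogue of the
oops in pci_slot_release.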


I've not tried your patch yet, but I don't think it works because
pci_create_slot() can be executed by some hotplug drivers (pciehp,
shpchp, ...) before parent->dev is initialized.

Anyway, I'll try it and report the result as soon as possible.

Thanks,
Kenji Kaneshige

