Re: [PATCH] PCI: fix kernel oops on bridge removal

* Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>:
> Hi,
> 
> I encountered the kernel oops when I tried bridge removal using
> Alex's logical hotplug interface on Jesse's linux-next. I'm
> attaching the patch to solve this problem. See the description
> of the attached patch for details.
> 
> This patch is against Jesse's linux-next.
> 
> Thanks,
> Kenji Kaneshige
> 
> 
> Fix the following kernel oops problem that happens when removing PCI
> bridge with pciehp loaded. It should also occur with other hotplug
> driver that is implemented as a bridge's driver.
> 
> [  459.997257] pciehp 0000:2f:04.0:pcie24: unloading service driver pciehp
> [  459.997495] general protection fault: 0000 [#1] SMP
> [  459.997737] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:2e:00.0/0000:2f:04.0/remove
> [  459.997964] CPU 4
> [  459.998129] Modules linked in: pciehp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc battery ac parport_pc lp parport mptspi mptscsih mptbase scsi_transport_spi e1000e sg sr_mod cdrom button serio_raw i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> [  459.998129] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1 PRIMERGY
> [  459.998129] RIP: 0010:[<ffffffff803bf047>]  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
> [  459.998129] RSP: 0018:ffff88083b3bf9e0  EFLAGS: 00010246
> [  459.998129] RAX: ffff88083adc5158 RBX: ffff880836c1bc80 RCX: 6b6b6b6b6b6b6b6b
> [  459.998129] RDX: 0000000000000000 RSI: ffffffff803a77f0 RDI: ffff880836c1bc48
> [  459.998129] RBP: ffff88083b3bfa00 R08: 0000000000000002 R09: 0000000000000000
> [  459.998129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880836c1bc48
> [  459.998129] R13: ffff880836c1bc20 R14: ffff880836c1bc48 R15: ffff880836d1ec38
> [  459.998129] FS:  0000000000000000(0000) GS:ffff88083ccc3770(0000) knlGS:0000000000000000
> [  459.998129] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [  459.998129] CR2: 00007f1562f1d558 CR3: 0000000838090000 CR4: 00000000000006e0
> [  459.998129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  459.998129] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  459.998129] Process events/4 (pid: 56, threadinfo ffff88083b3be000, task ffff88083b3b3e40)
> [  459.998129] Stack:
> [  459.998129]  ffff880836c1bc80 ffff880836c1bc48 ffffffff80793320 ffff88083b0d0960
> [  459.998129]  ffff88083b3bfa30 ffffffff803a788a ffff880836c1bc80 ffffffff803a77f0
> [  459.998129]  ffff880836c1bc20 ffff880836d1ec38 ffff88083b3bfa50 ffffffff803a8ce7
> [  459.998129] Call Trace:
> [  459.998129]  [<ffffffff803a788a>] kobject_release+0x9a/0x290
> [  459.998129]  [<ffffffff803a77f0>] ? kobject_release+0x0/0x290
> [  459.998129]  [<ffffffff803a8ce7>] kref_put+0x37/0x80
> [  459.998129]  [<ffffffff803a76f7>] kobject_put+0x27/0x60
> [  459.998129]  [<ffffffff803bebcc>] ? pci_destroy_slot+0x3c/0xc0
> [  459.998129]  [<ffffffff803bebd5>] pci_destroy_slot+0x45/0xc0
> [  459.998129]  [<ffffffff803c797d>] pci_hp_deregister+0x13d/0x210
> [  459.998129]  [<ffffffffa031141d>] cleanup_slots+0x2d/0x80 [pciehp]
> [  459.998129]  [<ffffffffa0311735>] pciehp_remove+0x15/0x30 [pciehp]
> [  459.998129]  [<ffffffff803c4c99>] pcie_port_remove_service+0x69/0x90
> [  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
> [  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
> [  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
> [  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
> [  459.998129]  [<ffffffff803c4d90>] ? remove_iter+0x0/0x40
> [  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
> [  459.998129]  [<ffffffff803c4dbf>] remove_iter+0x2f/0x40
> [  459.998129]  [<ffffffff8043ddf3>] device_for_each_child+0x33/0x60
> [  459.998129]  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
> [  459.998129]  [<ffffffff803c4d30>] pcie_port_device_remove+0x30/0x80
> [  459.998129]  [<ffffffff803c55a1>] pcie_portdrv_remove+0x11/0x20
> [  459.998129]  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
> [  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
> [  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
> [  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
> [  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
> [  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
> [  459.998129]  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
> [  459.998129]  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
> [  459.998129]  [<ffffffff803c10d9>] remove_callback+0x29/0x40
> [  459.998129]  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
> [  459.998129]  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> [  459.998129]  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> [  459.998129]  [<ffffffff8025846f>] worker_thread+0x9f/0x100
> [  459.998129]  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> [  459.998129]  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> [  459.998129]  [<ffffffff8025b89d>] kthread+0x4d/0x80
> [  459.998129]  [<ffffffff8020d4ba>] child_rip+0xa/0x20
> [  459.998129]  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> [  459.998129]  [<ffffffff8025b850>] ? kthread+0x0/0x80
> [  459.998129]  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> [  459.998129] Code: 56 49 89 fe 41 55 4c 8d 6f d8 41 54 53 74 09 f6 05 b8 05 c7 00 08 75 72 49 8b 45 00 48 8b 48 28 eb 05 66 90 48 89 f1 49 8b 45 00 <48> 8b 31 48 83 c0 28 0f 18 0e 48 39 c1 74 1c 8b 41 38 41 0f b6
> [  459.998129] RIP  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
> [  459.998129]  RSP <ffff88083b3bf9e0>
> [  460.018595] ---[ end trace 5a08d2095374aedc ]---
> 
> The pci_remove_bus_device() removes all buses and devices under the
> bridge, and then remove the bridge. So the remove() callback of the
                   removes
> hotplug drivers implemented as a bridge's driver is executed after the
> struct pci_bus of the bridge's secondary bus is removed. The remove()
> callback of those driver deregister the slot using pci_destroy_slot(),
                           unregisters
> and slot's release callback refers the struct pci_bus that was already
                              refers to the
> freed. This is the cause of the kernel oops.
> 
> This patch solves the problem by stop all the driver before removing
> the bridge and its childe bus and devices.
                     child
> 
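The ordering problem described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: every model_* name is hypothetical, the "freed" bus is merely flagged as dead so that the stale access can be detected safely, and the first call in model_remove_patched() stands in for the pci_stop_bus_device() call that the patch adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_bus and struct pci_slot. */
struct model_bus {
	int alive;		/* 1 while the bus object is still valid */
	int nr_slots;		/* slots currently registered on the bus */
};

struct model_slot {
	struct model_bus *bus;	/* back-pointer followed by the release callback */
};

enum { MODEL_OK = 0, MODEL_STALE_BUS = -1 };

/* Hypothetical bridge with one slot and one hotplug driver bound to it. */
struct model_bridge {
	struct model_bus subordinate;
	struct model_slot slot;
	int driver_bound;
	int last_release;	/* result of the most recent slot release */
};

static void model_bridge_init(struct model_bridge *br)
{
	assert(br != NULL);
	br->subordinate.alive = 1;
	br->subordinate.nr_slots = 1;
	br->slot.bus = &br->subordinate;
	br->driver_bound = 1;
	br->last_release = MODEL_OK;
}

/* Models freeing the secondary bus; the memory is only marked dead here. */
static void model_bus_free(struct model_bus *bus)
{
	bus->alive = 0;
}

/* Models the slot release callback, which needs a live bus. */
static int model_slot_release(struct model_slot *slot)
{
	if (!slot->bus->alive)
		return MODEL_STALE_BUS;	/* the kernel would touch freed memory here */
	slot->bus->nr_slots--;
	return MODEL_OK;
}

/* Models the hotplug driver's remove() callback; a no-op once unbound. */
static void model_driver_remove(struct model_bridge *br)
{
	if (!br->driver_bound)
		return;
	br->last_release = model_slot_release(&br->slot);
	br->driver_bound = 0;
}

/* Order before the patch: the bus goes away first, then the driver is unbound. */
static int model_remove_old(struct model_bridge *br)
{
	model_bus_free(&br->subordinate);
	model_driver_remove(br);
	return br->last_release;
}

/* Order after the patch: drivers are stopped while the bus is still valid. */
static int model_remove_patched(struct model_bridge *br)
{
	model_driver_remove(br);	/* stands in for pci_stop_bus_device() */
	model_bus_free(&br->subordinate);
	model_driver_remove(br);	/* already unbound, so nothing happens */
	return br->last_release;
}
```

On a freshly initialised bridge, model_remove_old() returns MODEL_STALE_BUS because the release callback runs after the bus is gone, while model_remove_patched() returns MODEL_OK.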

Good catch, thank you Kenji-san. I didn't see this because I
didn't have hotplug drivers loaded during my testing. :-/

I was originally thinking of making the hotplug drivers register
a bus notifier, similar to what Trent did with his new legacy
fakephp. That is probably still necessary, but this change is a
good start.

I tested this patch on my machines and it works fine in the "no
hotplug drivers loaded" case.

Jesse, can you just clean up the changelog (and patch title)
before applying?

Thanks.

Acked-by: Alex Chiang <achiang@xxxxxx>

> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>
> 
>  drivers/pci/remove.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> Index: linux-next-20090323/drivers/pci/remove.c
> ===================================================================
> --- linux-next-20090323.orig/drivers/pci/remove.c
> +++ linux-next-20090323/drivers/pci/remove.c
> @@ -95,6 +95,7 @@ EXPORT_SYMBOL(pci_remove_bus);
>   */
>  void pci_remove_bus_device(struct pci_dev *dev)
>  {
> +	pci_stop_bus_device(dev);
>  	if (dev->subordinate) {
>  		struct pci_bus *b = dev->subordinate;
>  
> 