On Fri, 06 Nov 2009 11:25:13 +0900
Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx> wrote:

> Alex Chiang wrote:
> > Hello Kenji-san,
> >
> > I'm hitting a BUG_ON() in pcie_update_aspm_capable(), introduced
> > in 07d92760d2ee542fe932f4e8b5807dd98481d1fd during a PCI logical
> > hotplug operation.
> >
>
> Hi Alex-san,
>
> Could you try the patch attached below?
>
> Thanks,
> Kenji Kaneshige
>
>
> Fix the following BUG_ON() problem reported by Alex Chiang.
>
> This problem happened when removing PCIe root port using PCI logical
> hotplug operation.
>
> The immediate cause of this problem is that the pointer to invalid
> data structure is passed to pcie_update_aspm_capable() by
> pcie_aspm_exit_link_state(). When pcie_aspm_exit_link_state() received
> a pointer to root port link, it unconfigures the root port link and
> frees its data structure at first. At this point, there are not links
> to configure under the root port and the data structure for root port
> link is already freed. So pcie_aspm_exit_link_state() must not call
> pcie_update_aspm_capable() and pcie_config_aspm_path().
>
> This patch fixes the problem by changing pcie_aspm_exit_link_state()
> not to call pcie_update_aspm_capable() and pcie_config_aspm_path() if
> the specified link is root port link.
>
> ------------[ cut here ]------------
> kernel BUG at drivers/pci/pcie/aspm.c:606!
> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> last sysfs file: /sys/devices/pci0000:40/0000:40:13.0/remove
> CPU 1
> Modules linked in: shpchp
> Pid: 9345, comm: sysfsd Not tainted 2.6.32-rc5 #98 ProLiant DL785 G6
> RIP: 0010:[<ffffffff811df69b>] [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
> RSP: 0018:ffff88082a2f5ca0 EFLAGS: 00010202
> RAX: 0000000000000e77 RBX: ffff88182cc3e000 RCX: ffff88082a33d006
> RDX: 0000000000000001 RSI: ffffffff811dff4a RDI: ffff88182cc3e000
> RBP: ffff88082a2f5cc0 R08: ffff88182cc3e000 R09: 0000000000000000
> R10: ffff88182fc00180 R11: ffff88182fc00198 R12: ffff88182cc3e000
> R13: 0000000000000000 R14: ffff88182cc3e000 R15: ffff88082a2f5e20
> FS: 00007f259a64b6f0(0000) GS:ffff880864600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00007feb53f73da0 CR3: 000000102cc94000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process sysfsd (pid: 9345, threadinfo ffff88082a2f4000, task ffff88082a33cf00)
> Stack:
> ffff88182cc3e000 ffff88182cc3e000 0000000000000000 ffff88082a33cf00
> <0> ffff88082a2f5cf0 ffffffff811dff52 ffff88082a2f5cf0 ffff88082c525168
> <0> ffff88402c9fd2f8 ffff88402c9fd2f8 ffff88082a2f5d20 ffffffff811d7db2
> Call Trace:
> [<ffffffff811dff52>] pcie_aspm_exit_link_state+0xf5/0x11e
> [<ffffffff811d7db2>] pci_stop_bus_device+0x76/0x7e
> [<ffffffff811d7d67>] pci_stop_bus_device+0x2b/0x7e
> [<ffffffff811d7e4f>] pci_remove_bus_device+0x15/0xb9
> [<ffffffff811dcb8c>] remove_callback+0x29/0x3a
> [<ffffffff81135aeb>] sysfs_schedule_callback_work+0x15/0x6d
> [<ffffffff81072790>] worker_thread+0x19d/0x298
> [<ffffffff8107273b>] ? worker_thread+0x148/0x298
> [<ffffffff81135ad6>] ? sysfs_schedule_callback_work+0x0/0x6d
> [<ffffffff810765c0>] ? autoremove_wake_function+0x0/0x38
> [<ffffffff810725f3>] ? worker_thread+0x0/0x298
> [<ffffffff8107629e>] kthread+0x7d/0x85
> [<ffffffff8102eafa>] child_rip+0xa/0x20
> [<ffffffff8102e4bc>] ? restore_args+0x0/0x30
> [<ffffffff81076221>] ? kthread+0x0/0x85
> [<ffffffff8102eaf0>] ? child_rip+0x0/0x20
> Code: 89 e5 8a 50 48 31 c0 c0 ea 03 83 e2 07 e8 b2 de fe ff c9 48 98 c3 55 48 89 e5 41 56 49 89 fe 41 55 41 54 53 48 83 7f 10 00 74 04 <0f> 0b eb fe 48 8b 05 da 7d 63 00 4c 8d 60 e8 4c 89 e1 eb 24 4c
> RIP [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
> RSP <ffff88082a2f5ca0>
> ---[ end trace 6ae0f65bdeab8555 ]---
>
> Reported-by: Alex Chiang <achiang@xxxxxx>
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>

I applied this with Alex's tested by since I didn't see a new one.
Hope that's ok with you, Kenji-san. If not, please send an
incremental patch to fix things up.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center
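
For anyone reading this in the archive without the diff at hand, the shape of the fix described in Kenji's changelog can be sketched in user space roughly as below. This is only an illustration under stated assumptions, not the patch that was applied: struct link_state, new_link(), exit_link_state(), recheck_upstream() and recheck_count are hypothetical stand-ins, and recheck_upstream() takes the place of the pcie_update_aspm_capable() and pcie_config_aspm_path() calls named in the changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-link ASPM state. */
struct link_state {
	struct link_state *parent;	/* NULL for a root port link */
	struct link_state *root;	/* top of this link's hierarchy */
};

/* Counts how often the upstream recheck step ran (illustration only). */
static int recheck_count;

/*
 * Stand-in for the recheck/reconfigure step that the changelog says
 * must not run once the root port link itself has been freed.
 */
static void recheck_upstream(struct link_state *root,
			     struct link_state *parent_link)
{
	(void)root;
	(void)parent_link;
	recheck_count++;
}

/* Allocate a link; a link without a parent is its own root. */
static struct link_state *new_link(struct link_state *parent)
{
	struct link_state *link = malloc(sizeof(*link));

	assert(link != NULL);
	link->parent = parent;
	link->root = parent ? parent->root : link;
	return link;
}

/* Tear down a link, then recheck upstream only if there is an upstream. */
static void exit_link_state(struct link_state *link)
{
	/* Save what is needed before the link is freed. */
	struct link_state *root = link->root;
	struct link_state *parent_link = link->parent;

	free(link);

	/*
	 * For a root port link, root pointed at the structure that was
	 * just freed, so the recheck step is skipped entirely.
	 */
	if (parent_link)
		recheck_upstream(root, parent_link);
}
```

Tearing down a downstream link still triggers the recheck against its (still valid) root, while tearing down the root port link frees it and returns without touching the stale pointer. The real function does considerably more (locking, unconfiguring the link, list handling); only the ordering and the guard are modelled here.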