Hi,

I've been tracking down two issues, and one of them seems to be a problem with either usbcore or xhci.

DWC3, when acting as host, instantiates an xhci platform_device and sets itself as the parent of that device. That's all fine and dandy until I try to modprobe -r dwc3.ko, which causes XHCI to hang:

| # lsmod
| Module Size Used by
| xhci_hcd 116180 0
| dwc3 46765 0
| udc_core 10472 1 dwc3
| dwc3_omap 5402 0
| matrix_keypad 7218 0
| lis3lv02d_i2c 3718 0
| lis3lv02d 16439 1 lis3lv02d_i2c
| input_polldev 5315 1 lis3lv02d
| # lsusb
| Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
| Bus 001 Device 005: ID 0b95:7720 ASIX Electronics Corp. AX88772
| Bus 001 Device 004: ID 1a40:0101 Terminus Technology Inc. 4-Port HUB
| Bus 001 Device 003: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC
| Bus 001 Device 002: ID 1a40:0201 Terminus Technology Inc. FE 2.1 7-port Hub
| Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
| # modprobe -r dwc3
| [ 53.016798] xhci-hcd xhci-hcd.0.auto: remove, state 4
| [ 53.023083] usb usb2: USB disconnect, device number 1
| [ 53.082845] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
| [ 53.090732] xhci-hcd xhci-hcd.0.auto: USB bus 2 deregistered
| [ 53.112511] xhci-hcd xhci-hcd.0.auto: remove, state 1
| [ 53.117883] usb usb1: USB disconnect, device number 1
| [ 53.123301] usb 1-1: USB disconnect, device number 2
| [ 53.128503] usb 1-1.6: USB disconnect, device number 3
| [ 90.539781] INFO: task modprobe:1792 blocked for more than 30 seconds.
| [ 90.546607] Not tainted 3.17.0-rc2-00004-ge0b64425 #800
| [ 90.552672] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
| [ 90.560855] modprobe D c06bf5a0 0 1792 1662 0x00000000
| [ 90.567541] [<c06bf5a0>] (__schedule) from [<c06bfa94>] (schedule+0x40/0x8c)
| [ 90.574925] [<c06bfa94>] (schedule) from [<c06c3e48>] (schedule_timeout+0x154/0x220)
| [ 90.583031] [<c06c3e48>] (schedule_timeout) from [<c06c0554>] (wait_for_common+0xdc/0x178)
| [ 90.591672] [<c06c0554>] (wait_for_common) from [<c06c0610>] (wait_for_completion+0x20/0x24)
| [ 90.600537] [<c06c0610>] (wait_for_completion) from [<bf0569d4>] (xhci_configure_endpoint+0xc8/0x590 [xhci_hcd])
| [ 90.611226] [<bf0569d4>] (xhci_configure_endpoint [xhci_hcd]) from [<bf057664>] (xhci_check_bandwidth+0x16c/0x294 [xhci_hcd])
| [ 90.623100] [<bf057664>] (xhci_check_bandwidth [xhci_hcd]) from [<c04e5578>] (usb_hcd_alloc_bandwidth+0x1dc/0x320)
| [ 90.633938] [<c04e5578>] (usb_hcd_alloc_bandwidth) from [<c04e8160>] (usb_disable_device+0x198/0x1f8)
| [ 90.643586] [<c04e8160>] (usb_disable_device) from [<c04df3fc>] (usb_disconnect+0x7c/0x224)
| [ 90.652323] [<c04df3fc>] (usb_disconnect) from [<c04df54c>] (usb_disconnect+0x1cc/0x224)
| [ 90.660778] 8 locks held by modprobe/1792:
| [ 90.665055] #0: (&dev->mutex){......}, at: [<c0439c04>] driver_detach+0x54/0xc8
| [ 90.672929] #1: (&dev->mutex){......}, at: [<c0439c10>] driver_detach+0x60/0xc8
| [ 90.680798] #2: (&dev->mutex){......}, at: [<c0439524>] device_release_driver+0x28/0x3c
| [ 90.689373] #3: (usb_bus_list_lock){+.+.+.}, at: [<c04e4e04>] usb_remove_hcd+0xa0/0x1b4
| [ 90.697971] #4: (&dev->mutex){......}, at: [<c04df3d0>] usb_disconnect+0x50/0x224
| [ 90.706022] #5: (&dev->mutex){......}, at: [<c04df3d0>] usb_disconnect+0x50/0x224
| [ 90.714069] #6: (&dev->mutex){......}, at: [<c04df3d0>] usb_disconnect+0x50/0x224
| [ 90.722109] #7: (hcd->bandwidth_mutex){+.+.+.}, at: [<c04e814c>] usb_disable_device+0x184/0x1f8

This only happens when I have devices attached to the XHCI port on my platform (AM437x, but I suppose any XHCI would die similarly if you can destroy the underlying {platform,pci}_device). If I first remove xhci and then remove dwc3, it works fine:

| # lsmod
| Module Size Used by
| xhci_hcd 116180 0
| dwc3 46765 0
| udc_core 10472 1 dwc3
| matrix_keypad 7218 0
| dwc3_omap 5402 0
| lis3lv02d_i2c 3718 0
| lis3lv02d 16439 1 lis3lv02d_i2c
| input_polldev 5315 1 lis3lv02d
| # lsusb
| Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
| Bus 001 Device 005: ID 0b95:7720 ASIX Electronics Corp. AX88772
| Bus 001 Device 004: ID 1a40:0101 Terminus Technology Inc. 4-Port HUB
| Bus 001 Device 003: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC
| Bus 001 Device 002: ID 1a40:0201 Terminus Technology Inc. FE 2.1 7-port Hub
| Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
| # modprobe -r xhci-hcd
| [ 38.895745] xhci-hcd xhci-hcd.0.auto: remove, state 4
| [ 38.902034] usb usb2: USB disconnect, device number 1
| [ 38.933439] xhci-hcd xhci-hcd.0.auto: USB bus 2 deregistered
| [ 38.945408] xhci-hcd xhci-hcd.0.auto: remove, state 1
| [ 38.950968] usb usb1: USB disconnect, device number 1
| [ 38.956280] usb 1-1: USB disconnect, device number 2
| [ 38.961563] usb 1-1.6: USB disconnect, device number 3
| [ 38.980267] usb 1-1.7: USB disconnect, device number 4
| [ 38.985710] usb 1-1.7.4: USB disconnect, device number 5
| [ 38.994068] asix 1-1.7.4:1.0 eth1: unregister 'asix' usb-xhci-hcd.0.auto-1.7.4, ASIX AX88772 USB 2.0 Ethernet
| [ 39.122913] xhci-hcd xhci-hcd.0.auto: USB bus 1 deregistered
| # modprobe -r dwc3
| #

It also works fine if I don't have anything attached to the XHCI port:

| # lsmod
| Module Size Used by
| xhci_hcd 116180 0
| dwc3 46765 0
| udc_core 10472 1 dwc3
| matrix_keypad 7218 0
| dwc3_omap 5402 0
| lis3lv02d_i2c 3718 0
| lis3lv02d 16439 1 lis3lv02d_i2c
| input_polldev 5315 1 lis3lv02d
| # lsusb
| Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
| Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
| # modprobe -r dwc3
| [ 63.910052] xhci-hcd
xhci-hcd.0.auto: remove, state 4
| [ 63.915429] usb usb2: USB disconnect, device number 1
| [ 63.959522] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
| [ 63.967461] xhci-hcd xhci-hcd.0.auto: USB bus 2 deregistered
| [ 63.981720] xhci-hcd xhci-hcd.0.auto: remove, state 4
| [ 63.987160] usb usb1: USB disconnect, device number 1
| [ 64.006709] xhci-hcd xhci-hcd.0.auto: USB bus 1 deregistered

In case you want to know, this is running v3.17-rc2, but I know that at least v3.14 also exhibits the same problem.

Any suggestions on how to get this thing sorted out? I'm pretty much running out of ideas :-s

The second problem I have is exposed because I reverted commit c5a1fbc (usb: dwc3: dwc3-omap: Fix the crash on module removal). That fix is wrong: it had the side effect of modprobe -r dwc3-omap *NOT* destroying the platform_device for dwc3.ko, so dwc3.ko would never be unprobed and its resources would not be destroyed.

I traced this one down to __release_resource() hitting a NULL pointer dereference when grabbing a pointer to old->parent->child, but I can't seem to figure out exactly what is wrong there. It doesn't seem, to me, that old->parent or old->parent->child should ever be NULL... Any ideas?

| # modprobe -r dwc3-omap
| [ 539.835401] Unable to handle kernel NULL pointer dereference at virtual address 00000018
| [ 539.844043] pgd = eb83c000
| [ 539.846893] [00000018] *pgd=00000000
| [ 539.850734] Internal error: Oops: 5 [#1] SMP ARM
| [ 539.855588] Modules linked in: xhci_hcd matrix_keypad dwc3_omap(-) lis3lv02d_i2c lis3lv02d input_polldev [last unloaded: udc_core]
| [ 539.867977] CPU: 0 PID: 1878 Comm: modprobe Not tainted 3.17.0-rc2-00004-ge0b64425 #800
| [ 539.876384] task: ed0d4040 ti: ed07c000 task.ti: ed07c000
| [ 539.882076] PC is at release_resource+0x24/0x90
| [ 539.886847] LR is at lock_acquired+0x280/0x3b8
| [ 539.891509] pc : [<c004eba8>] lr : [<c0091f8c>] psr: 60000013
| [ 539.891509] sp : ed07ddf0 ip : ed07dd80 fp : ed07de04
| [ 539.903570] r10: 00000000 r9 : ed07c000 r8 : c000f064
| [ 539.909061] r7 : 00000081 r6 : c0577eec r5 : ed564c00 r4 : eb97da80
| [ 539.915900] r3 : 00000000 r2 : 00000000 r1 : 60000013 r0 : c004eba4
| [ 539.922740] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
| [ 539.930238] Control: 10c5387d Table: ab83c059 DAC: 00000015
| [ 539.936274] Process modprobe (pid: 1878, stack limit = 0xed07c248)
| [ 539.942751] Stack: (0xed07ddf0 to 0xed07e000)
| [ 539.947324] dde0: 00000001 ed564c00 ed07de1c ed07de08
| [ 539.955897] de00: c043b670 c004eb90 ed564c00 00000000 ed07de34 ed07de20 c043b6bc c043b600
| [ 539.964476] de20: c0c55528 ed564c10 ed07de4c ed07de38 c0577f78 c043b6ac ed564c10 00000000
| [ 539.973065] de40: ed07de74 ed07de50 c0435ac4 c0577ef8 ed20eb40 ed487578 ed07de84 ed20f410
| [ 539.981649] de60: ed210010 ed210044 ed07de84 ed07de78 c0577ee4 c0435a7c ed07de9c ed07de88
| [ 539.990206] de80: bf013310 c0577ed0 ed210010 bf013e6c ed07deac ed07dea0 c043b010 bf0132c4
| [ 539.998764] dea0: ed07dec4 ed07deb0 c04394a8 c043aff4 ed210010 bf013e6c ed07dee4 ed07dec8
| [ 540.007339] dec0: c0439c74 c0439434 ed0d4040 bf013e6c 00000000 00000800 ed07defc ed07dee8
| [ 540.015914] dee0: c04391a4 c0439bbc bf01391c bf013e6c ed07df14 ed07df00 c043a4e4 c0439154
| [ 540.024498] df00: bf01391c bf013eb0 ed07df24 ed07df18 c043b7c4 c043a4b8 ed07df34 ed07df28
| [ 540.033082] df20: bf013930 c043b7b4 ed07dfa4 ed07df38 c00cab3c bf013928 ed07df54 00000000
| [ 540.041650] df40: bf013eb0 00000800 ed07df3c 33637764 616d6f5f 00000070 ed07df84 ed07df68
| [ 540.050246] df60: c00906a4 c00904ec b7007220 b7007254 00000000 00000081 ed07df94 ed07df88
| [ 540.058818] df80: c00907fc 00090584 00000000 b7007220 b7007254 00000000 00000000 ed07dfa8
| [ 540.067412] dfa0: c000ede0 c00caa28 b7007220 b7007254 b7007254 00000800 b7006000 000254b8
| [ 540.076003] dfc0: b7007220 b7007254 00000000 00000081 b7007254 00000001 b7007008 b70072b0
| [ 540.084595] dfe0: b6f31420 be99b76c b6feff98 b6f3142c 60000010 b7007254 ed064e2b 50b60016
| [ 540.093219] [<c004eba8>] (release_resource) from [<c043b670>] (platform_device_del+0x7c/0xac)
| [ 540.102181] [<c043b670>] (platform_device_del) from [<c043b6bc>] (platform_device_unregister+0x1c/0x30)
| [ 540.112048] [<c043b6bc>] (platform_device_unregister) from [<c0577f78>] (of_platform_device_destroy+0x8c/0x98)
| [ 540.122557] [<c0577f78>] (of_platform_device_destroy) from [<c0435ac4>] (device_for_each_child+0x54/0x80)
| [ 540.132612] [<c0435ac4>] (device_for_each_child) from [<c0577ee4>] (of_platform_depopulate+0x20/0x28)
| [ 540.142312] [<c0577ee4>] (of_platform_depopulate) from [<bf013310>] (dwc3_omap_remove+0x58/0x78 [dwc3_omap])
| [ 540.152634] [<bf013310>] (dwc3_omap_remove [dwc3_omap]) from [<c043b010>] (platform_drv_remove+0x28/0x2c)
| [ 540.162665] [<c043b010>] (platform_drv_remove) from [<c04394a8>] (__device_release_driver+0x80/0xd4)
| [ 540.172233] [<c04394a8>] (__device_release_driver) from [<c0439c74>] (driver_detach+0xc4/0xc8)
| [ 540.181251] [<c0439c74>] (driver_detach) from [<c04391a4>] (bus_remove_driver+0x5c/0xb0)
| [ 540.189750] [<c04391a4>] (bus_remove_driver) from [<c043a4e4>] (driver_unregister+0x38/0x58)
| [ 540.198601] [<c043a4e4>] (driver_unregister) from [<c043b7c4>] (platform_driver_unregister+0x1c/0x20)
| [ 540.208274] [<c043b7c4>] (platform_driver_unregister) from [<bf013930>] (dwc3_omap_driver_exit+0x14/0x1c [dwc3_omap])
| [ 540.219407] [<bf013930>] (dwc3_omap_driver_exit [dwc3_omap]) from [<c00cab3c>] (SyS_delete_module+0x120/0x1b0)
| [ 540.229943] [<c00cab3c>] (SyS_delete_module) from [<c000ede0>] (ret_fast_syscall+0x0/0x48)
| [ 540.238617] Code: e1a04000 e59f006c eb19da12 e5943010 (e5932018)
| [ 540.245128] ---[ end trace ee0e6e3f9c9ba6ac ]---
| [ 540.249985] note: modprobe[1878] exited with preempt_count 1
| Segmentation fault
| #

FYI, PC dies at line 241 of kernel/resource.c:

| (gdb) l *(release_resource + 0x24)
| 0xc004eba8 is in release_resource (kernel/resource.c:241).
| 236 {
| 237 struct resource *tmp, **p;
| 238
| 239 p = &old->parent->child;
| 240 for (;;) {
| 241 tmp = *p;
| 242 if (!tmp)
| 243 break;
| 244 if (tmp == old) {
| 245 *p = tmp->sibling;

Based on that, either old->parent or old->parent->child is NULL. But the faulting virtual address is 0x18 (a 24-byte offset), which, if I calculated correctly, is the offset of child inside struct resource. So parent is NULL, &old->parent->child evaluates to 0x18, and the read of *p at line 241 faults.

cheers

-- 
balbi