Hi, it seems I can hit an infinite loop in the following function. The way to trigger it is a stressing night of suspend/resume cycles repeated all the time while some hiddev is being accessed. void usbhid_init_reports(struct hid_device *hid) { struct hid_report *report; struct usbhid_device *usbhid = hid->driver_data; int err, ret; list_for_each_entry(report, &hid->report_enum[HID_INPUT_REPORT].report_list, list) usbhid_submit_report(hid, report, USB_DIR_IN); list_for_each_entry(report, &hid->report_enum[HID_FEATURE_REPORT].report_list, list) usbhid_submit_report(hid, report, USB_DIR_IN); err = 0; ret = usbhid_wait_io(hid); while (ret) { err |= ret; if (test_bit(HID_CTRL_RUNNING, &usbhid->iofl)) usb_kill_urb(usbhid->urbctrl); if (test_bit(HID_OUT_RUNNING, &usbhid->iofl)) usb_kill_urb(usbhid->urbout); ret = usbhid_wait_io(hid); } if (err) hid_warn(hid, "timeout initializing reports\n"); } (I really don't understand why that loop in usbhid_init_reports) Here is also usbhid_wait_io for convenience: int usbhid_wait_io(struct hid_device *hid) { struct usbhid_device *usbhid = hid->driver_data; if (!wait_event_timeout(usbhid->wait, (!test_bit(HID_CTRL_RUNNING, &usbhid->iofl) && !test_bit(HID_OUT_RUNNING, &usbhid->iofl)), 10*HZ)) { dbg_hid("timeout waiting for ctrl or out queue to clear\n"); return -1; } return 0; } My feeling is that during the suspend procedure some USB jobs are flushed without alerting the waiters. What's bad is that if this happens, the HID_CTRL_RUNNING or HID_OUT_RUNNING are not cleared and the waiters keep waiting forever despite the timeouts. This becomes a problem when some waiter is holding such a resource that prevents a successful suspend (like hiddev->existancelock, if usbhid_init_reports is among the waiters). This is the only useful kind of trace I can extract from the ashes. I've quite a bit of these. It's a 2.6.32.9 based kernel but the relevant code seems almost unchanged. It's not easy for me to test a different kernel (but I'm managing to get there in a not so far future). [ 3697.245574] khubd D c03a1ac8 0 243 2 0x00000000 [ 3697.252227] [<c03a1ac8>] (schedule+0x314/0x3a8) from [<c03a26f4>] (__mutex_lock_slowpath+0xc0/0x150) [ 3697.261810] [<c03a26f4>] (__mutex_lock_slowpath+0xc0/0x150) from [<c02e9cc4>] (hiddev_disconnect+0x1c/0x78) [ 3697.272155] [<c02e9cc4>] (hiddev_disconnect+0x1c/0x78) from [<c02e171c>] (hid_disconnect+0x30/0x48) [ 3697.281616] [<c02e171c>] (hid_disconnect+0x30/0x48) from [<c02e176c>] (hid_device_remove+0x38/0x58) [ 3697.291107] [<c02e176c>] (hid_device_remove+0x38/0x58) from [<c021be88>] (__device_release_driver+0x64/0xa4) [ 3697.301391] [<c021be88>] (__device_release_driver+0x64/0xa4) from [<c021bfcc>] (device_release_driver+0x1c/0x28) [ 3697.312042] [<c021bfcc>] (device_release_driver+0x1c/0x28) from [<c021b574>] (bus_remove_device+0x6c/0x7c) [ 3697.322143] [<c021b574>] (bus_remove_device+0x6c/0x7c) from [<c0219df8>] (device_del+0x110/0x168) [ 3697.331420] [<c0219df8>] (device_del+0x110/0x168) from [<c02e1148>] (hid_destroy_device+0x20/0x3c) [ 3697.340789] [<c02e1148>] (hid_destroy_device+0x20/0x3c) from [<c02e79ec>] (usbhid_disconnect+0x2c/0x3c) [ 3697.350646] [<c02e79ec>] (usbhid_disconnect+0x2c/0x3c) from [<c024cf30>] (usb_unbind_interface+0x5c/0xfc) [ 3697.360656] [<c024cf30>] (usb_unbind_interface+0x5c/0xfc) from [<c021be88>] (__device_release_driver+0x64/0xa4) [ 3697.371368] [<c021be88>] (__device_release_driver+0x64/0xa4) from [<c021bfcc>] (device_release_driver+0x1c/0x28) [ 3697.382019] [<c021bfcc>] (device_release_driver+0x1c/0x28) from [<c021b574>] (bus_remove_device+0x6c/0x7c) [ 3697.392120] [<c021b574>] (bus_remove_device+0x6c/0x7c) from [<c0219df8>] (device_del+0x110/0x168) [ 3697.401397] [<c0219df8>] (device_del+0x110/0x168) from [<c024a1ec>] (usb_disable_device+0x7c/0xf0) [ 3697.410766] [<c024a1ec>] (usb_disable_device+0x7c/0xf0) from [<c024589c>] (usb_disconnect+0x88/0x10c) [ 3697.420410] [<c024589c>] (usb_disconnect+0x88/0x10c) from [<c02465ec>] (hub_thread+0x460/0xd80) [ 3697.429534] [<c02465ec>] (hub_thread+0x460/0xd80) from [<c00c92d8>] (kthread+0x7c/0x84) [ 3697.437896] [<c00c92d8>] (kthread+0x7c/0x84) from [<c008497c>] (kernel_thread_exit+0x0/0x8) [ 3699.232818] App D c03a1ac8 0 5968 1 0x00080001 [ 3699.239471] [<c03a1ac8>] (schedule+0x314/0x3a8) from [<c03a22c4>] (schedule_timeout+0x1c8/0x1fc) [ 3699.248687] [<c03a22c4>] (schedule_timeout+0x1c8/0x1fc) from [<c02e8ab8>] (usbhid_wait_io+0xa4/0xe8) [ 3699.258392] [<c02e8ab8>] (usbhid_wait_io+0xa4/0xe8) from [<c02e92fc>] (usbhid_init_reports+0xb0/0xe4) [ 3699.268035] [<c02e92fc>] (usbhid_init_reports+0xb0/0xe4) from [<c02ea758>] (hiddev_ioctl+0x28c/0x75c) [ 3699.277679] [<c02ea758>] (hiddev_ioctl+0x28c/0x75c) from [<c011dc6c>] (vfs_ioctl+0x2c/0x8c) [ 3699.286437] [<c011dc6c>] (vfs_ioctl+0x2c/0x8c) from [<c011e350>] (do_vfs_ioctl+0x58c/0x5d4) [ 3699.295166] [<c011e350>] (do_vfs_ioctl+0x58c/0x5d4) from [<c011e3e4>] (sys_ioctl+0x4c/0x6c) [ 3699.303894] [<c011e3e4>] (sys_ioctl+0x4c/0x6c) from [<c0083f40>] (ret_fast_syscall+0x0/0x2c) With this idea I tryed with a simple patch: diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index dedd8e4..46224c0 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -1383,7 +1383,9 @@ static void hid_cease_io(struct usbhid_device *usbhid) del_timer_sync(&usbhid->io_retry); usb_kill_urb(usbhid->urbin); usb_kill_urb(usbhid->urbctrl); + clear_bit(HID_CTRL_RUNNING, &usbhid->iofl); usb_kill_urb(usbhid->urbout); + clear_bit(HID_OUT_RUNNING, &usbhid->iofl); } but the HID_OUT_RUNNING is still not cleared and the loop is infinite. I've put some traces to show whether those usb_kill_urb() in usbhid_init_reports are actually killing anything but I'll see the results tomorrow. In the meanwhile I'm a bit out of clues, admitted that I'm right. Any suggestions? Regards, Domenico -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html