On Sun, 22 Nov 2009, Simon Arlott wrote: > 21 19:37:36 [58320.379525] temperhum D 000034e2 0 2212 2210 > would have 3 USB devices open, but two of them are on other controllers > and the devices are accessed serially. The fact that they are on other controllers need not matter. After all, the usb_kill_urb_queue wait_queue_head is shared among all the buses. > There are another 3 more USB > devices; I didn't check all of them but at least one was still working > properly. The other two would not have been actively in use. > > If it happens again I'll get a stack trace of all tasks. Aren't they > stuck in the following call inside usb_kill_urb()? > wait_event(usb_kill_urb_queue, atomic_read(&urb->use_count) == 0); Yes. And I was wrong before -- they aren't stuck waiting for the spinlock; they are waiting for the condition to be true. > Before that, it calls usb_hcd_unlink_urb(), but the return value is > ignored? Yes. At this point we just want to make sure the URB gets killed. We don't care if it was already killed by someone else or if it wasn't running in the first place. You'd get into trouble if you called usb_kill_urb() for an URB that was already deallocated, but that can't happen in usb_start_wait_urb(). > One possible path is: > usb_hcd_unlink_urb() > unlink1() // use_count > 0 > usb_rh_urb_dequeue() > usb_hcd_check_unlink_urb() > > which could return non-zero stopping usb_hcd_unlink_urb() from calling > usb_hcd_giveback_urb() and then the use_count isn't decremented. It > looks like the hardware would then be responsible for causing > usb_hcd_giveback_urb() to run, unless usb_hcd_poll_rh_status() runs. Not the hardware -- the software. URBs sent to root hubs are handled by usbcore and the controller drivers. But the URBs in question weren't sent to root hubs; the URB from temperhum was sent to a serial device managed by the ch341 driver and the URB from khubd was sent to an external hub. The only reason for these calls to get stuck is that either the OHCI controller hardware or the ohci-hcd driver got confused and failed to complete the URBs properly. > This code in usb_hcd_check_unlink_urb() does not match its comment: > /* Any status except -EINPROGRESS means something already started to > * unlink this URB from the hardware. So there's no more work to do. > */ > if (urb->unlinked) > return -EBUSY; > urb->unlinked = status; > "Any status except -EINPROGRESS" but it doesn't check what the value of > urb->unlinked is, so it could be -EINPROGRESS. The comment is out of date. It should say something like: If urb->unlinked is nonzero then someone else has already started to unlink this URB from the hardware, so there's nothing more to do. > I can try connecting multiple hub devices which won't stay stable again, > but not on the same hardware (although it'll still be an OHCI HCD). Go ahead. Collect a usbmon trace when you do it. That way we'll be able to see when URBs start and complete. You should also look at the contents of the debugging files for the OHCI controller after the hang occurs. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html