Re: xhci DWC3 flavor problem

Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> · Fri, 20 May 2016 16:56:34 +0300

On 19.05.2016 18:42, Joao Pinto wrote:

After a few moments the scheduling problem happened again:

INFO: task kworker/0:1:349 blocked for more than 120 seconds.
       Not tainted 4.6.0-rc5 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1     D ffffff8008086c60     0   349      2 0x00000000
Workqueue: usb_hub_wq hub_event
Call trace:
[<ffffff8008086c60>] __switch_to+0xc8/0xd4
[<ffffff8008638774>] __schedule+0x18c/0x5c8
[<ffffff8008638be8>] schedule+0x38/0x98
[<ffffff800863b71c>] schedule_timeout+0x160/0x1ac
[<ffffff8008639714>] wait_for_common+0xac/0x150
[<ffffff80086397cc>] wait_for_completion+0x14/0x1c
[<ffffff8008489998>] xhci_alloc_dev+0xf4/0x2a0
[<ffffff800844ffd0>] usb_alloc_dev+0x68/0x2cc
[<ffffff800845696c>] hub_event+0x784/0x11f4
[<ffffff80080ce444>] process_one_work+0x130/0x2f4
[<ffffff80080ce65c>] worker_thread+0x54/0x434
[<ffffff80080d40fc>] kthread+0xd4/0xe8
[<ffffff8008085e10>] ret_from_fork+0x10/0x40
INFO: task kworker/0:1:349 blocked for more than 120 seconds.
       Not tainted 4.6.0-rc5 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1     D ffffff8008086c60     0   349      2 0x00000000
Workqueue: usb_hub_wq hub_event
Call trace:
[<ffffff8008086c60>] __switch_to+0xc8/0xd4
[<ffffff8008638774>] __schedule+0x18c/0x5c8
[<ffffff8008638be8>] schedule+0x38/0x98
[<ffffff800863b71c>] schedule_timeout+0x160/0x1ac
[<ffffff8008639714>] wait_for_common+0xac/0x150
[<ffffff80086397cc>] wait_for_completion+0x14/0x1c
[<ffffff8008489998>] xhci_alloc_dev+0xf4/0x2a0
[<ffffff800844ffd0>] usb_alloc_dev+0x68/0x2cc
[<ffffff800845696c>] hub_event+0x784/0x11f4
[<ffffff80080ce444>] process_one_work+0x130/0x2f4
[<ffffff80080ce65c>] worker_thread+0x54/0x434
[<ffffff80080d40fc>] kthread+0xd4/0xe8
[<ffffff8008085e10>] ret_from_fork+0x10/0x40

So Chris' patch did not solve this problem either.

Thanks.

Looks like there are still some issues in cleaning up
pending commands after the host dies.

This shouldn't happen. When the host dies we clean up the command
ring and call completion for all pending commands, and we set the state
to DISABLED or HALTED to prevent new commands from being queued.
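
To make that ordering concrete, here is a rough user-space sketch of the
idea, not the actual xhci code. All names (toy_ring, toy_cmd,
toy_queue_cmd, toy_host_died) are made up for illustration, and a plain
bool stands in for a struct completion. The point is only that the state
is flipped first, so nothing new can be queued, and then every pending
command is completed so no waiter is left behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the driver. */
enum ring_state { RING_RUNNING, RING_HALTED };
enum cmd_status { CMD_PENDING, CMD_ABORTED };

struct toy_cmd {
	enum cmd_status status;
	bool completed;          /* stands in for a struct completion */
	struct toy_cmd *next;
};

struct toy_ring {
	enum ring_state state;
	struct toy_cmd *pending; /* singly linked list of queued commands */
};

/* Queue a command; refuse once the ring is no longer running. */
int toy_queue_cmd(struct toy_ring *ring, struct toy_cmd *cmd)
{
	assert(ring != NULL && cmd != NULL);
	if (ring->state != RING_RUNNING)
		return -1;
	cmd->status = CMD_PENDING;
	cmd->completed = false;
	cmd->next = ring->pending;
	ring->pending = cmd;
	return 0;
}

/*
 * Host died: block new commands first, then complete every pending one.
 * Returns the number of commands that were completed.
 */
int toy_host_died(struct toy_ring *ring)
{
	int n = 0;

	assert(ring != NULL);
	ring->state = RING_HALTED;
	while (ring->pending != NULL) {
		struct toy_cmd *cmd = ring->pending;

		ring->pending = cmd->next;
		cmd->next = NULL;
		cmd->status = CMD_ABORTED;
		cmd->completed = true; /* a real waiter would be woken here */
		n++;
	}
	return n;
}
```

In this model a waiter only stays blocked if its command is never on the
pending list when the cleanup runs, or if it is queued after the cleanup
without the state check rejecting it.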

But this all happens long after the first command times out, the ring
abort fails, and we kill the host, right? So the bigger concern for you is
why the command never completes in the first place.

-Mathias
