On 08.02.2016 15:49, Chris Bainbridge wrote:
The XHCI controller presents two USB buses to the system - one for USB 2 and one for USB 3. When only one bus is locked there is a race condition with two threads in hub_port_init: [ 8.984500] Call Trace: [ 8.985698] [<ffffffff81b9bab7>] schedule+0x37/0x90 [ 8.986918] [<ffffffff817da7cd>] usb_kill_urb+0x8d/0xd0 [ 8.988130] [<ffffffff8111e5e0>] ? wake_up_atomic_t+0x30/0x30 [ 8.989343] [<ffffffff817dafbe>] usb_start_wait_urb+0xbe/0x150 [ 8.990561] [<ffffffff817db10c>] usb_control_msg+0xbc/0xf0 [ 8.991766] [<ffffffff817d07de>] hub_port_init+0x51e/0xb70 [ 8.992964] [<ffffffff817d4697>] hub_event+0x817/0x1570 [ 8.994156] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620 [ 8.995342] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620 [ 8.996528] [<ffffffff810f4684>] worker_thread+0x64/0x4b0 [ 8.997707] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390 [ 8.998883] [<ffffffff810fa7f5>] kthread+0x105/0x120 [ 9.000056] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 [ 9.001241] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70 [ 9.002420] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 [ 9.870837] Call Trace: [ 9.875664] [<ffffffff817fd36d>] xhci_setup_device+0x53d/0xa40 [ 9.876871] [<ffffffff817fd87e>] xhci_address_device+0xe/0x10 [ 9.878068] [<ffffffff817d047f>] hub_port_init+0x1bf/0xb70 [ 9.879262] [<ffffffff811247ed>] ? trace_hardirqs_on+0xd/0x10 [ 9.880465] [<ffffffff817d4697>] hub_event+0x817/0x1570 [ 9.881668] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620 [ 9.882869] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620 [ 9.884074] [<ffffffff810f4684>] worker_thread+0x64/0x4b0 [ 9.885268] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390 [ 9.886457] [<ffffffff810fa7f5>] kthread+0x105/0x120 [ 9.887634] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 [ 9.888817] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70 [ 9.889995] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200 Which results from the two call chains: hub_port_init usb_get_device_descriptor usb_get_descriptor usb_control_msg usb_internal_control_msg usb_start_wait_urb usb_submit_urb / wait_for_completion_timeout / usb_kill_urb hub_port_init hub_set_address xhci_address_device xhci_setup_device The logged kernel errors are: [ 8.034843] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command [ 13.183701] usb 3-3: device descriptor read/all, error -110 On a test system this failure occurred on 6% of all boots. Hypothetically, it should perhaps be possible to set the address of the hub on one bus while the hub on the other bus already has an outstanding descriptor read request. But as this is not working reliably, fix it by locking the controller in hub_port_init to prevent parallel initialisation of both buses.
Most likely xhci is messed up after two device slots are in default state at the same time. This happens when both threads are in hub_port_init() have called hub_port_reset() The issue becomes visible when the the descriptor read and set address both fail after the port resets. xhci specs 4.5.3 has one tiny note about this: "Note: Software shall not transition more than one Device Slot to the Default State at a time" So to me, and from xhci pov this patch looks like the correct solution, but I might be missing some usb core side details. -Mathias -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html