Re: crash in usb_hc_died+0x16 when unplugging usb-c dock


 



On 07.09.2016 17:53, Alan Stern wrote:
On Wed, 7 Sep 2016, Mathias Nyman wrote:

I'm still seeing occasional problems. For example, when I unplugged the dock last night, it seems to have wedged some things, and then plugging it back in didn't work. See some logs below.


I ran a show-blocked-tasks after plugging the dock back in:


It looks like usb_hub_wq is trying to handle the disconnect event
at the same time as the PCI remove code is removing the xhci hosts (and their connected devices).

Sep  7 09:03:30 fred kernel: [83879.383356] Workqueue: usb_hub_wq hub_event
Sep  7 09:03:30 fred kernel: [83879.383395] Call Trace:
Sep  7 09:03:30 fred kernel: [83879.383416]  [<ffffffff81855fa5>] schedule+0x35/0x80
Sep  7 09:03:30 fred kernel: [83879.383427]  [<ffffffff8163433d>] usb_kill_urb+0x8d/0xc0
Sep  7 09:03:30 fred kernel: [83879.383444]  [<ffffffff810c4490>] ? wake_atomic_t_function+0x60/0x60
Sep  7 09:03:30 fred kernel: [83879.383454]  [<ffffffff81633076>] usb_hcd_flush_endpoint+0x126/0x190
Sep  7 09:03:30 fred kernel: [83879.383465]  [<ffffffff81635fbb>] usb_disable_endpoint+0x9b/0xb0


Sep  7 09:03:30 fred kernel: [83879.383686] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
Sep  7 09:03:30 fred kernel: [83879.383717] Call Trace:
Sep  7 09:03:30 fred kernel: [83879.383728]  [<ffffffff81855fa5>] schedule+0x35/0x80
Sep  7 09:03:30 fred kernel: [83879.383738]  [<ffffffff8185624e>] schedule_preempt_disabled+0xe/0x10
Sep  7 09:03:30 fred kernel: [83879.383748]  [<ffffffff81857ea9>] __mutex_lock_slowpath+0xb9/0x130
Sep  7 09:03:30 fred kernel: [83879.383758]  [<ffffffff81857f3f>] mutex_lock+0x1f/0x30
Sep  7 09:03:30 fred kernel: [83879.383766]  [<ffffffff8162b951>] usb_disconnect+0x51/0x280
Sep  7 09:03:30 fred kernel: [83879.383776]  [<ffffffff816314f0>] usb_remove_hcd+0xd0/0x240

My first guess is that something goes wrong when killing the URB.
usb_hub_wq takes the roothub device lock first, and then ends up waiting in usb_kill_urb forever.

I agree.  Probably xhci-hcd is waiting for the controller to do
something before it will give back the cancelled URB.  But since the
controller has been removed, it never does anything.

This would block the pci remove path when usb_remove_hcd calls usb_disconnect, which
tries to take the roothub lock as well.

Doing a usbfs read on a usb device also takes the roothub device lock, which could explain
why lsusb is blocked.

This is just an idea; I need to check the code in more detail to see if it's a possible cause.

ehci-hcd includes checks in several places for ehci->rh_state ==
RH_STATE_RUNNING.  The removal pathway sets ehci->rh_state to
RH_STATE_HALTED.  As a result, the driver avoids waiting for things
that will never happen.
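
As a rough user-space illustration of that pattern (not the actual ehci-hcd code), the check could be sketched like this. The struct, the two-value enum, and the fake_* functions are hypothetical stand-ins invented for the example; only the idea of testing rh_state before waiting on the hardware comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real ehci-hcd types. */
enum rh_state { RH_STATE_HALTED, RH_STATE_RUNNING };

struct fake_hc {
    enum rh_state rh_state;
    int waiting_on_hw;  /* URBs parked until the controller confirms the unlink */
    int given_back;     /* URBs completed immediately by the driver */
};

/* Removal path: mark the root hub halted before tearing anything down. */
void fake_hc_remove(struct fake_hc *hc)
{
    assert(hc != NULL);
    hc->rh_state = RH_STATE_HALTED;
}

/*
 * Dequeue path: only wait for the hardware if the controller is running.
 * Returns true if the URB was given back right away, false if it now
 * depends on the controller finishing the unlink.
 */
bool fake_urb_dequeue(struct fake_hc *hc)
{
    assert(hc != NULL);
    if (hc->rh_state != RH_STATE_RUNNING) {
        /* Controller is gone or halted: nothing will ever complete this. */
        hc->given_back++;
        return true;
    }
    hc->waiting_on_hw++;
    return false;
}
```

With the state flipped to halted first, a later dequeue completes immediately instead of parking the URB behind hardware that no longer exists.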


Yes, seems that there are two things that need to be done for xhci here.

The first part is to do in xhci_urb_dequeue what ehci does: make sure the
host is alive before queuing any stop endpoint commands. It does check whether PCI reads return
0xffffffff or the host is XHCI_STATE_DYING, but we could detect a removal a lot earlier.
The second part is to make sure that the canceled URB is given back if the stop endpoint command
times out.
Currently the xhci_stop_endpoint_command_watchdog() function may return without
giving back canceled URBs, causing usb_kill_urb() to wait in wait_event(usb_kill_urb_queue, ..) forever with
locks held, which blocks the PCI remove thread.
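
To make the second part concrete, here is a minimal user-space sketch of a timeout handler that always gives back whatever is on the canceled list. It is not the xhci code and not the eventual patch: fake_urb, fake_ep and fake_stop_ep_watchdog are hypothetical names, and using -ESHUTDOWN as the completion status is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real struct urb or xhci endpoint types. */
struct fake_urb {
    int status;
    bool given_back;
    struct fake_urb *next;
};

struct fake_ep {
    struct fake_urb *cancelled_list;  /* URBs waiting for stop endpoint to finish */
};

/*
 * Called when the stop endpoint command has timed out. Instead of
 * returning early, drain the canceled list so that anyone sleeping in
 * a usb_kill_urb()-style wait can make progress.
 * Returns the number of URBs given back.
 */
int fake_stop_ep_watchdog(struct fake_ep *ep)
{
    int count = 0;

    assert(ep != NULL);
    while (ep->cancelled_list != NULL) {
        struct fake_urb *urb = ep->cancelled_list;

        ep->cancelled_list = urb->next;
        urb->next = NULL;
        urb->status = -ESHUTDOWN;  /* assumed status for a dead host */
        urb->given_back = true;
        count++;
    }
    return count;
}
```

The point of the sketch is only that every exit from the timeout handler leaves the canceled list empty, so no waiter is left behind with locks held.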

I'll start writing a patch

-Mathias