Re: Random xHCI HC died on device disconnect


 



On 03.10.2017 18:27, Kristian Evensen wrote:
On Tue, Oct 3, 2017 at 4:51 PM, Kristian Evensen
<kristian.evensen@xxxxxxxxx> wrote:
Disabling the timer caused a different error to be triggered. Instead
of "HC died...", I now get the following message looping over and
over:

[16870.871935] qmi_wwan 4-1:1.4: Tx URB error: -19

I was not thinking clearly when I wrote the email. The reason this
message kept looping over and over was that I kept trying to
communicate with the modem. When I stopped my tool, the message
stopped. However, the host controller is still dead, so even if I try
to disconnect the device using GPIO, nothing happens. I thought the
log of when the error occurred was lost due to limited buffers, but I
was wrong and I have the following in dmesg when the error strikes:

[15986.400431] usb 4-1: USB disconnect, device number 13
[15986.400454] qmi_wwan 4-1:1.4: nonzero urb status received: -71
[15986.411366] qmi_wwan 4-1:1.4: wdm_int_callback - 0 bytes
[15986.416692] qmi_wwan 4-1:1.4: wdm_int_callback - usb_submit_urb failed with result -19
[15986.424816] option1 ttyUSB0: GSM modem (1-port) converter now disconnected from ttyUSB0
[15986.432886] option 4-1:1.0: device disconnected
[15986.437647] option1 ttyUSB1: GSM modem (1-port) converter now disconnected from ttyUSB1
[15986.445765] option 4-1:1.1: device disconnected
[16001.110698] xhci-hcd f10f8000.usb3: Stopped the command ring failed, maybe the host is dead
[16001.119077] xhci-hcd f10f8000.usb3: Abort command ring failed
[16001.124949] xhci-hcd f10f8000.usb3: HC died; cleaning up
[16100.854819] qmi_wwan 4-1:1.4: Tx URB error: -19
[16105.854944] qmi_wwan 4-1:1.4: Tx URB error: -19
[16110.855052] qmi_wwan 4-1:1.4: Tx URB error: -19
[16115.855159] qmi_wwan 4-1:1.4: Tx URB error: -19
[16120.855285] qmi_wwan 4-1:1.4: Tx URB error: -19
[16125.855396] qmi_wwan 4-1:1.4: Tx URB error: -19
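
For reference, the numeric statuses in the log above are negated Linux errno values. The snippet below is only a small lookup sketch; the table and the decode_status helper are illustrative and not part of any driver. It covers just the two codes that appear in this log.

```python
# Linux errno values seen in the log above (from include/uapi/asm-generic/errno*.h).
# The table is deliberately hardcoded so the output does not depend on the host OS.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    71: ("EPROTO", "Protocol error"),
}


def decode_status(status: int) -> str:
    """Map a negative kernel status such as -19 to its symbolic errno name."""
    name, text = LINUX_ERRNO.get(abs(status), ("UNKNOWN", "not in this table"))
    return f"{status}: {name} ({text})"


for status in (-71, -19):
    print(decode_status(status))
```

So the -71 on the interrupt URB is EPROTO at the moment of disconnect, and the repeating -19 is ENODEV: the driver is being asked to transmit to a device that is already gone.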

I am a bit surprised to see the HC-related messages. Maybe I missed a
place where the timer is stopped or something, time to investigate!


This is the xhci->cmd_timer (delayed work), which has a five-second timeout
for the command currently being processed on the command ring.
When it triggers, it aborts the current command by stopping the command ring
and then removing, or moving past, the current command.

The logs show that the command first timed out, and that xhci then failed to
stop the command ring when trying to abort the command.

To me it looks like the xHC ends up in a state that we can't recover from without resetting it.
Reloading the xhci module, or rebinding the device and driver, is needed.
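
As a rough sketch of the rebind option, assuming the controller is a platform device bound to a driver named "xhci-hcd" (both names are taken from the "xhci-hcd f10f8000.usb3" prefix in the log, so verify them on your system), the generic sysfs bind/unbind files can be used. The helper name and the sysfs_root parameter are made up for illustration; the latter exists only so the function can be tried against a scratch directory.

```python
from pathlib import Path


def rebind_platform_driver(device: str,
                           driver: str = "xhci-hcd",
                           sysfs_root: str = "/sys") -> None:
    """Unbind and then rebind a platform device from its driver via sysfs.

    Assumes the standard /sys/bus/platform/drivers/<driver>/{unbind,bind}
    layout. Needs root when run against the real /sys.
    """
    driver_dir = Path(sysfs_root) / "bus" / "platform" / "drivers" / driver
    # Writing the device name to "unbind" detaches the driver from the device.
    (driver_dir / "unbind").write_text(device)
    # Writing it to "bind" makes the driver probe the device again,
    # which re-initialises the host controller.
    (driver_dir / "bind").write_text(device)


# Example (as root), using the device name from the log:
# rebind_platform_driver("f10f8000.usb3")
```

Whether this is enough to bring the controller back depends on the hardware; if the xHC does not respond to a reset either, the rebind will fail during probe.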

-Mathias

