On 10.04.2018 12:15, russianneuromancer@xxxxx wrote:
Hello!
On the Dell Venue 8 Pro 5855 tablet, installing tlp or running
"powertop --auto-tune" causes an "xHCI host controller not responding,
assume dead" error. When the error happens, two integrated USB devices
(the Bluetooth adapter and the LTE modem) disappear until reboot. This
issue was first observed in Linux 4.13 and is still present in Linux
4.16. Blacklisting both "Linux Foundation 3.0 root hub" devices from
autosuspend in the tlp configuration works around the issue. However,
on other devices tlp works fine without blacklisting USB hub
autosuspend, and on this tablet there was no such issue before (at
least in the Linux ~4.8-4.12 range), so I assume there is a regression
somewhere.
Are there any related commits between 4.12 and 4.13 that I could try
to revert?
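For anyone reproducing this, here is a minimal sketch for checking which USB devices currently have runtime autosuspend allowed, which is the setting that tlp and "powertop --auto-tune" change. It assumes the standard sysfs power/control attribute ("auto" allows autosuspend, "on" keeps the device powered). The function name and the optional directory argument are made up for illustration.

```shell
# Print "<entry> <power/control value>" for every entry under the USB
# sysfs directory that exposes a power/control attribute.
# usb_autosuspend_status is a hypothetical helper name; the optional
# argument only exists so the function can be pointed at another tree.
usb_autosuspend_status() {
    devdir="${1:-/sys/bus/usb/devices}"
    for ctl in "$devdir"/*/power/control; do
        [ -r "$ctl" ] || continue          # skip entries without the attribute
        dev="${ctl%/power/control}"        # strip the attribute path
        printf '%s %s\n' "${dev##*/}" "$(cat "$ctl")"
    done
}
```

Calling usb_autosuspend_status before and after starting tlp should show the root hubs (usb1, usb2, ...) switching from "on" to "auto". Writing "on" to a root hub's power/control as root should be the manual equivalent of the blacklist workaround.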
In 4.12 we added sensitivity to hotplug-removed xHC controllers,
i.e. if we read 0xffffffff from an xhci register we assume the host
is removed and start cleaning up.
commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
xhci: Rework how we handle unresponsive or hoptlug removed hosts
You can try to revert that, but as a final solution we should
find the real root cause.
How the issue looks in the logs:
[ 227.258385] xhci_hcd 0000:00:14.0: xHC is not running.
[ 329.671544] xhci_hcd 0000:00:14.0: xHC is not running.
[ 416.695796] xhci_hcd 0000:00:14.0: xHC is not running.
The "xHC is not running" message is the xhci driver handling a port
event interrupt for a resuming port while the whole host controller is
not running.
We stop the host controller in xhci_suspend(), and start it in
xhci_resume().
Attaching a patch that better prevents xhci host suspend during
USB2 resume signaling.
It could help, worth a shot.
[ 416.695862] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
This means xhci_hc_died() was called, and there are many possible
call sites. Adding the code below could give a hint:
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index daa94c3..51fb3d0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -900,7 +900,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)
 	if (xhci->xhc_state & XHCI_STATE_DYING)
 		return;
 
-	xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
+	xhci_err(xhci, "%ps: xHCI host controller not responding, assume dead\n",
+		 __builtin_return_address(0));
 	xhci->xhc_state |= XHCI_STATE_DYING;
 
 	xhci_cleanup_command_queue(xhci);
[ 416.695900] xhci_hcd 0000:00:14.0: HC died; cleaning up
[ 416.696052] usb 1-3: USB disconnect, device number 2
[ 416.815610] cdc_mbim 1-3:1.12 wwp0s20u3i12: unregister 'cdc_mbim' usb-0000:00:14.0-3, CDC MBIM
[ 416.847934] usb 1-4: USB disconnect, device number 3
After that, the Bluetooth adapter and the LTE modem disappear from the
lsusb output, while the xHCI controller itself remains visible.
We stop all host activity in xhci_hc_died(), so no USB devices under
this host will work.
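To pull just the relevant lines out of a saved dmesg file, something like the sketch below may be handy. The function name is hypothetical, and the patterns are simply the message strings quoted in this thread, so they would also match the "%ps:"-prefixed variant printed by the debug patch above.

```shell
# Print the xhci messages that mark the failure in a saved dmesg file.
# xhci_death_lines is a hypothetical helper; $1 is the path to the log.
xhci_death_lines() {
    grep -E 'xHC is not running|not responding, assume dead|HC died' "$1"
}
```

For example, after "dmesg > dmesg.txt", calling "xhci_death_lines dmesg.txt" lists the port event warnings and the point where the host was declared dead, with their timestamps.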
Complete dmesg: https://paste.fedoraproject.org/paste/7aMpVGLfZ82zppdGs56Oqg
lsusb -v: https://paste.fedoraproject.org/paste/c7y8GisC13YdzcYE9B-JIw
dsdt.dsl: https://paste.fedoraproject.org/paste/8g6mp2dafypUkFT4sa43iA
xhci traces and dynamic debug could help:
mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
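Once the "assume dead" error has been reproduced with tracing enabled, the trace buffer can be copied out along these lines. This is only a sketch: the function name, the optional arguments and the default output file name are assumptions, while the trace file path is the usual debugfs location matching the commands above.

```shell
# Copy the ftrace buffer to a file after the error has been reproduced.
# save_xhci_trace is a hypothetical helper; both arguments are optional.
save_xhci_trace() {
    tracing_dir="${1:-/sys/kernel/debug/tracing}"
    out="${2:-xhci-trace.txt}"
    cat "$tracing_dir/trace" > "$out" && echo "saved trace to $out"
}
```

Run save_xhci_trace as root, then save the kernel log as well (for example "dmesg > dmesg.txt"), since the dynamic debug messages end up there rather than in the trace file.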
-Mathias