On 23.08.2016 16:13, Greg KH wrote:
On Tue, Aug 23, 2016 at 01:54:05PM +0300, Mathias Nyman wrote:
On 23.08.2016 02:21, Jose Marino wrote:
I'm using my phone (Nexus 5X running Android) to tether a USB connection to my laptop (XPS 15 9550). I plug the phone in through the USB-C port and select USB tethering on the phone. Initially things look normal: a usb0 network interface appears on the laptop and it tries to get an IP address via DHCP. However, I observe two different behaviors depending on whether it's a fresh boot or I have suspended and resumed the laptop. After a fresh boot everything works fine: I get an IP address and the connection works as expected. If I unplug the phone, everything also works as expected.
However, after a suspend/resume cycle, I plug the phone in but the laptop never connects to it. The usb0 interface still appears, but the DHCP daemon is unable to get any response and finally times out. The fun part happens when I unplug the phone: I consistently get a kernel panic.
I managed to get some logs of the oops+panic from pstore. Find them attached. In this particular situation this is what I did:
- Boot laptop (archlinux with kernel 4.7.2)
- Suspend/resume
- Plug Nexus 5X
- After a few seconds unplug Nexus 5X
I filed a bug report about this: https://bugzilla.kernel.org/show_bug.cgi?id=153551
The Dell XPS 9550 has an additional xhci controller for handling the type-C port.
This controller is hotplug removed from the PCI bus when the last USB type-c
device is disconnected.
The xhci driver, and it seems the USB core as well, is not really designed with this in mind.
The USB core can handle this just fine.
The xhci driver will suddenly start reading ffffffff from PCI.
Which means the device is gone, and you need to handle it properly. We
fixed up ehci and ohci for this years ago (they were on hotplug busses).
For every PCI read, you need to verify that the data is correct, that's
the way that any PCI driver needs to work in a "modern" system.
This is an XHCI issue, not a USB core issue :)
Yes, reading ffffffff and reacting properly to it is purely xhci.
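The "verify every PCI read" rule discussed above can be sketched in userspace C. This is only an illustration under stated assumptions: struct fake_hc, hc_read32() and hc_poll_status() are hypothetical names, not the real xhci structures or helpers, and a plain array stands in for the memory-mapped registers. It also assumes all-ones is never a legitimate value for the register being read, which has to be checked per register in a real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a host controller; NOT the real struct xhci_hcd. */
struct fake_hc {
    const volatile uint32_t *regs; /* would be an ioremap()ed BAR in a real driver */
    bool dead;                     /* set once the device is seen to be gone */
};

/*
 * Read one 32-bit register. Reads from a PCI device that has been removed
 * come back with all bits set, so 0xffffffff is treated as "device gone".
 * Returns true and stores the value on success, false if the device is dead.
 */
bool hc_read32(struct fake_hc *hc, size_t index, uint32_t *val)
{
    assert(hc != NULL && val != NULL);

    if (hc->dead)
        return false;           /* never touch the hardware again */

    uint32_t v = hc->regs[index];
    if (v == UINT32_MAX) {
        hc->dead = true;        /* remember it, so later callers bail out early */
        return false;
    }

    *val = v;
    return true;
}

/*
 * Interrupt-handler-like caller: returns 1 if bit 0 of the status register
 * is set, 0 if it is clear, and -1 if the controller has disappeared.
 */
int hc_poll_status(struct fake_hc *hc)
{
    uint32_t status;

    if (!hc_read32(hc, 0, &status))
        return -1;              /* stop and wait for the bus code to remove us */

    return (status & 1u) ? 1 : 0;
}
```

Once hc_read32() has seen all-ones, every later call fails without touching the registers, which corresponds to the "just stop things and wait for the PCI core" option rather than an active teardown.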
I've been looking at issues related to this. Currently there is at least one similar case with mass storage
where we see the device release function being called for the mass storage interface device _after_ we
freed all memory related to both xhci hcds. The bug for that is here:
https://bugzilla.kernel.org/show_bug.cgi?id=120241
USB devices and their children should be removed synchronously before the hcds are freed, but it seems
that is not the case, at least not for the device release function of the interface device.
Once you notice that your PCI device is gone, you need to start tearing
things down as soon as possible. Or just stop things and wait for the
PCI core to come around and remove you from the system. That's probably
much simpler and I think is what was done for EHCI.
That part should be doable, but the part where the interface device release is called after hcd
is freed still puzzles me. As Alan suggested I need to check if the reference counting is correct.
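For reference, the ordering that correct reference counting is meant to guarantee can be sketched as follows. This is a userspace illustration only: fake_hcd, fake_intf and their helpers are hypothetical names, the counter is a plain int rather than the kernel's kref, and nothing here is thread-safe. The point is just that a child which took a reference on the hcd can still run its release function safely after the owner has dropped its own reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a host controller object; NOT struct usb_hcd. */
struct fake_hcd {
    int refcount;
    int *freed_flag;   /* demo hook: set to 1 when the memory is released */
};

/* Hypothetical stand-in for an interface device that depends on the hcd. */
struct fake_intf {
    struct fake_hcd *hcd;
};

struct fake_hcd *hcd_create(int *freed_flag)
{
    struct fake_hcd *hcd = malloc(sizeof(*hcd));

    if (!hcd)
        return NULL;
    hcd->refcount = 1;              /* reference owned by the creator */
    hcd->freed_flag = freed_flag;
    return hcd;
}

struct fake_hcd *hcd_get(struct fake_hcd *hcd)
{
    assert(hcd->refcount > 0);      /* taking a ref on a freed object is a bug */
    hcd->refcount++;
    return hcd;
}

void hcd_put(struct fake_hcd *hcd)
{
    assert(hcd->refcount > 0);
    if (--hcd->refcount == 0) {     /* last reference gone: now it is safe to free */
        if (hcd->freed_flag)
            *hcd->freed_flag = 1;
        free(hcd);
    }
}

struct fake_intf *intf_create(struct fake_hcd *hcd)
{
    struct fake_intf *intf = malloc(sizeof(*intf));

    if (!intf)
        return NULL;
    intf->hcd = hcd_get(hcd);       /* child pins the hcd for its whole lifetime */
    return intf;
}

/* Release callback: may run late, but the hcd is still valid because of our ref. */
void intf_release(struct fake_intf *intf)
{
    struct fake_hcd *hcd = intf->hcd;

    free(intf);
    hcd_put(hcd);                   /* drop the child's reference last */
}
```

If the owner calls hcd_put() first and intf_release() runs afterwards, the hcd memory is freed only in the second call. A release function that runs after the hcd is already gone would therefore point at a missing get or an extra put somewhere.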
A horrible workaround to hide this issue was to sleep for a second or two before freeing the hcd memory,
this lets some pending work finish before hcds disappear. (more info in that bug report)
Yeah, that's not a good idea :)
Just an intermediate step to find which way to continue debugging, not a solution.
-Mathias