On 13.01.2016 00:01, Alan Stern wrote:
On Tue, 12 Jan 2016, Ivan P wrote:
I've uploaded the usbmon traces here:
https://www.dropbox.com/sh/0gldb4r4g6p4p5w/AAAdmHP_Slya3f440v9oe1qka?dl=0
One run tracing every bus (0u), one run tracing only the sandisk stick (2u).
Each trace is from starting to copy the files to the point it hangs
up, at which I attempt to cd into the mount point.
I looked at the second trace. It seems to indicate a bug in the xHCI
host controller hardware or driver.
Everything is okay almost up to the end. Here's where the trouble
starts:
ffff8802fd7bf180 1135243584 S Bo:2:004:2 -115 31 = 55534243 1cc50200 00100000 80000a28 0000000d e0000008 00000000 000000
ffff8802fd7bf180 1135243601 C Bo:2:004:2 0 31 >
ffff88028df546c0 1135243607 S Bi:2:004:1 -115 4096 <
ffff88028df546c0 1165404890 C Bi:2:004:1 -104 4096 = e83ad4f0 096d0965 b6e2cf54 9165f0e8 2a39d865 8e097d4d 2bef792c e0e7adaf
This shows the computer trying to read 4 KB of data from the device.
All of the data was received okay, but for some reason the transfer
didn't end properly. Instead, it timed out after 30 seconds and was
cancelled. That's the fundamental bug.
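As a rough cross-check of the reading above, the four trace lines can be parsed with a few lines of Python. This is only a sketch: the helper names (parse_line, urb_latencies, decode_cbw) are made up for illustration, the field layout assumes the usbmon text format with timestamps in microseconds, and the CBW layout assumed is the standard 31-byte Bulk-Only Transport command wrapper.

```python
import struct

# The four usbmon text lines quoted above.
TRACE = """\
ffff8802fd7bf180 1135243584 S Bo:2:004:2 -115 31 = 55534243 1cc50200 00100000 80000a28 0000000d e0000008 00000000 000000
ffff8802fd7bf180 1135243601 C Bo:2:004:2 0 31 >
ffff88028df546c0 1135243607 S Bi:2:004:1 -115 4096 <
ffff88028df546c0 1165404890 C Bi:2:004:1 -104 4096 = e83ad4f0 096d0965 b6e2cf54 9165f0e8 2a39d865 8e097d4d 2bef792c e0e7adaf
"""


def parse_line(line):
    """Split one usbmon text line into the fields used in this sketch."""
    f = line.split()
    data = b""
    if len(f) > 6 and f[6] == "=":
        # Data words follow the '=' marker as hex, without separators inside a word.
        data = bytes.fromhex("".join(f[7:]))
    return {
        "urb": f[0],            # URB tag (kernel address)
        "time_us": int(f[1]),   # timestamp in microseconds
        "event": f[2],          # S = submission, C = callback (completion)
        "pipe": f[3],           # type/direction : bus : device : endpoint
        "status": int(f[4]),    # negative errno, or 0
        "length": int(f[5]),
        "data": data,
    }


def urb_latencies(events):
    """Pair each completion with its submission by URB tag; return seconds."""
    pending = {}
    result = []
    for ev in events:
        if ev["event"] == "S":
            pending[ev["urb"]] = ev
        elif ev["event"] == "C" and ev["urb"] in pending:
            sub = pending.pop(ev["urb"])
            seconds = (ev["time_us"] - sub["time_us"]) / 1e6
            result.append((ev["pipe"], ev["status"], seconds))
    return result


def decode_cbw(data):
    """Decode a Bulk-Only Transport CBW that carries a READ(10) command."""
    sig, tag, xfer_len, flags, lun, cb_len = struct.unpack_from("<4sIIBBB", data)
    cdb = data[15:15 + cb_len]
    (lba,) = struct.unpack_from(">I", cdb, 2)     # SCSI fields are big-endian
    (blocks,) = struct.unpack_from(">H", cdb, 7)
    return {
        "signature": sig,
        "transfer_length": xfer_len,
        "direction_in": bool(flags & 0x80),
        "opcode": cdb[0],
        "lba": lba,
        "blocks": blocks,
    }


events = [parse_line(line) for line in TRACE.splitlines()]
for pipe, status, seconds in urb_latencies(events):
    print(pipe, status, "%.6f s" % seconds)
print(decode_cbw(events[0]["data"]))
```

On these four lines the bulk-in URB sits for about 30.16 seconds between submission and the -104 completion, and the CBW decodes as a READ(10) (opcode 0x28) for 8 blocks at LBA 3552 with a 4096-byte data phase, which matches the 4 KB read described above if the stick uses 512-byte blocks.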
Attempts to recover by resetting the device failed (apparently due to a
bug in the device) and from that point on, nothing worked. The device
kept reporting failures for each command, but with no error code.
Since the original problem looks like an xHCI-related issue, maybe
Mathias can suggest some things to try.
Does this occur on xHCI hosts from other vendors? How about on older kernels (before 4.3)?
There was a change in the 4.3 kernel (and older stable kernels) in how the xhci driver
returns bulk IN URBs. If a transfer is short, the driver won't give back the URB
immediately; instead it waits until it gets a completion event for the last transfer block
in that transfer descriptor.
(We should get that event even if the transfer was short and the last block was never filled.)
It turns out not all hosts send this second completion event, so that change will be reverted:
commit e210c422b6fdd2dc123bedc588f399aefd8bf9de
xhci: don't finish a TD if we get a short transfer event mid TD
Even though it is supposed to affect only short transfers with data over 64k, the symptoms you see
fit this area. If the xhci driver doesn't return the URB, it will be canceled with ECONNRESET (-104) status.
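For reading the status column in the usbmon trace, a small lookup table can help. This is a sketch only: the numbers are the usual Linux errno values, the descriptions paraphrase the kernel's USB error-code documentation, and the names URB_STATUS and describe_status are made up for illustration.

```python
# Negative errno values that commonly appear as URB status in usbmon output.
# Values are hardcoded (Linux numbering) so the table does not depend on the
# platform running this script.
URB_STATUS = {
    0: ("OK", "transfer completed successfully"),
    -115: ("EINPROGRESS", "URB still pending, no result yet"),
    -104: ("ECONNRESET", "URB was unlinked asynchronously (cancelled)"),
    -2: ("ENOENT", "URB was unlinked synchronously (killed)"),
    -32: ("EPIPE", "endpoint stalled"),
    -121: ("EREMOTEIO", "short read while URB_SHORT_NOT_OK was set"),
}


def describe_status(status):
    """Return a readable description for a usbmon status value."""
    name, text = URB_STATUS.get(status, ("?", "not in this table"))
    return "%d %s: %s" % (status, name, text)


for code in (-115, 0, -104):
    print(describe_status(code))
```

In the trace above, -115 on the submission lines just means the URB was queued, while the -104 on the final bulk-in completion is the cancellation after the timeout.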
Does verbose debugging for xhci show anything? Enable it with:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
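If it is more convenient to toggle this from a script, the same write can be sketched in Python. The helper names (dyndbg_query, set_module_debug) are hypothetical; the sketch assumes debugfs is mounted at /sys/kernel/debug, the kernel has dynamic debug enabled, and the script runs as root.

```python
# Default debugfs location of the dynamic debug control file (an assumption
# about where debugfs is mounted on the test machine).
CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def dyndbg_query(module, enable=True):
    """Build a query that sets ('=p') or clears ('-p') the print flag."""
    return "module %s %sp" % (module, "=" if enable else "-")


def set_module_debug(module, enable=True, control=CONTROL):
    """Write the query to the control file, like the echo command above."""
    with open(control, "w") as f:
        f.write(dyndbg_query(module, enable))


print(dyndbg_query("xhci_hcd"))          # the string written by the echo above
print(dyndbg_query("xhci_hcd", False))   # turns the messages off again
```

Calling set_module_debug("xhci_hcd") before reproducing the hang and set_module_debug("xhci_hcd", enable=False) afterwards should keep the extra output limited to the failing run; the messages end up in the kernel log (dmesg).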
-Mathias