Turns out I DO have another USB3.0 device, an external HDD enclosure.
The PC it is connected to is so weak that it is unable to reach USB3.0
speeds with it, so I forgot about it.

I've tested the controller on my PC with that USB HDD and there don't
seem to be any issues, unlike with the Sandisk stick - so it seems to be
the fault of the USB stick and not the controller per se.

How likely is this misbehavior to be fixed at a later date? I'm asking
because I have about a week left to be able to return the USB stick, so
if it's unlikely a workaround for the stick will be found/made, I'd
rather not keep it. Having to boot Windows just to copy files to/from it
kind of negates the speed advantage it offers.

On Thu, Jan 14, 2016 at 10:17 PM, Ivan P <chrnosphered@xxxxxxxxx> wrote:
> I've tested linux-lts (4.1.15), and the same thing happens (see
> usbmon_sandisk_on_lts trace)
> The verbose debugging is also included (dmesg_xhci_ex). I don't have
> another USB3.0 device, but I tested with a USB2.0 one and I'm getting
> intermittent one second freezes of the whole PC when writing to that
> USB2.0 stick. Using a USB2.0 port, nothing of the sort happens.
> However, even with the freezes, the copy process finishes correctly
> (see usbmon_patriot trace)
>
> I've also tried the stick on another PC that has the NEC Corporation
> uPD720200 USB 3.0 Host Controller (rev 03), but that one's even older,
> and unsurprisingly shows a similar picture
> (see usbmon_sandisk_on_nec)
>
> The new traces are in the same dropbox folder as before.
>
> On Thu, Jan 14, 2016 at 6:08 PM, Mathias Nyman
> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>> On 13.01.2016 00:01, Alan Stern wrote:
>>>
>>> On Tue, 12 Jan 2016, Ivan P wrote:
>>>
>>>> I've uploaded the usbmon traces here:
>>>> https://www.dropbox.com/sh/0gldb4r4g6p4p5w/AAAdmHP_Slya3f440v9oe1qka?dl=0
>>>>
>>>> One run tracing every bus (0u), one run tracing only the sandisk stick
>>>> (2u).
>>>> Each trace is from starting to copy the files to the point it hangs
>>>> up, at which point I attempt to cd into the mount point.
>>>
>>>
>>> I looked at the second trace.  It seems to indicate a bug in the xHCI
>>> host controller hardware or driver.
>>>
>>> Everything is okay almost up to the end.  Here's where the trouble
>>> starts:
>>>
>>>> ffff8802fd7bf180 1135243584 S Bo:2:004:2 -115 31 = 55534243 1cc50200
>>>> 00100000 80000a28 0000000d e0000008 00000000 000000
>>>> ffff8802fd7bf180 1135243601 C Bo:2:004:2 0 31 >
>>>> ffff88028df546c0 1135243607 S Bi:2:004:1 -115 4096 <
>>>> ffff88028df546c0 1165404890 C Bi:2:004:1 -104 4096 = e83ad4f0 096d0965
>>>> b6e2cf54 9165f0e8 2a39d865 8e097d4d 2bef792c e0e7adaf
>>>
>>>
>>> This shows the computer trying to read 4 KB of data from the device.
>>> All of the data was received okay, but for some reason the transfer
>>> didn't end properly.  Instead, it timed out after 30 seconds and was
>>> cancelled.  That's the fundamental bug.
>>>
>>> Attempts to recover by resetting the device failed (apparently due to a
>>> bug in the device) and from that point on, nothing worked.  The device
>>> kept reporting failures for each command, but with no error code.
>>>
>>> Since the original problem looks like an xHCI-related issue, maybe
>>> Mathias can suggest some things to try.
>>>
>>
>> Does this occur on xhci hosts from other vendors? How about older kernels?
>> (before 4.3)
>>
>> There was a change in the 4.3 kernel (and older stable) in how the xhci
>> driver returns bulk in URBs. If a transfer is short, the driver won't give
>> back the URB immediately; instead it waits until it gets a completion
>> event for the last transfer block in that transfer descriptor.
>> (we should get the event even if the transfer was short and never filled)
>>
>> Turns out not all hosts send this second completion event, so that change
>> will be reverted
>>
>> commit e210c422b6fdd2dc123bedc588f399aefd8bf9de
>> xhci: don't finish a TD if we get a short transfer event mid TD
>>
>> Even if it is supposed to only affect short transfers with data over 64k,
>> the symptoms you see fit this area. If the xhci driver doesn't return the
>> URB, it will be canceled with ECONNRESET (-104) status
>>
>> Does verbose debugging for xhci show anything? Enable it with:
>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>
>> -Mathias
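
For anyone who wants to check the reading of the quoted trace excerpt, here is a minimal Python sketch. It is only an illustration under stated assumptions: it handles just the bulk-URB form of the usbmon text format seen in those four records, the 31-byte payload is decoded as a Bulk-Only Transport command block wrapper carrying a SCSI READ(10), and all function names (parse_record, urb_latencies, decode_cbw, decode_read10) are made up for this example and are not part of any kernel tool.

```python
import struct

# The four usbmon records quoted in the thread, with the quote markers
# dropped and each record joined onto one line.
TRACE = """\
ffff8802fd7bf180 1135243584 S Bo:2:004:2 -115 31 = 55534243 1cc50200 00100000 80000a28 0000000d e0000008 00000000 000000
ffff8802fd7bf180 1135243601 C Bo:2:004:2 0 31 >
ffff88028df546c0 1135243607 S Bi:2:004:1 -115 4096 <
ffff88028df546c0 1165404890 C Bi:2:004:1 -104 4096 = e83ad4f0 096d0965 b6e2cf54 9165f0e8 2a39d865 8e097d4d 2bef792c e0e7adaf
"""


def parse_record(line):
    """Split one usbmon text record into its fields (bulk URBs only)."""
    f = line.split()
    words = f[7:]  # hex data words follow the '=' marker, if present
    return {
        "urb": f[0],
        "ts_us": int(f[1]),   # timestamp in microseconds
        "event": f[2],        # S = submission, C = completion
        "addr": f[3],         # type+direction : bus : device : endpoint
        "status": int(f[4]),
        "length": int(f[5]),
        "data": bytes.fromhex("".join(words)),
    }


def urb_latencies(events):
    """Pair each submission with its completion by URB address."""
    pending, done = {}, []
    for e in events:
        if e["event"] == "S":
            pending[e["urb"]] = e
        elif e["event"] == "C" and e["urb"] in pending:
            s = pending.pop(e["urb"])
            done.append((e["addr"], e["status"], e["ts_us"] - s["ts_us"]))
    return done


def decode_cbw(raw):
    """Decode the fixed 15-byte header of a command block wrapper."""
    sig, tag, xfer_len, flags, lun, cb_len = struct.unpack_from("<4sIIBBB", raw)
    return {
        "signature": sig,
        "tag": tag,
        "transfer_length": xfer_len,
        "direction": "in" if flags & 0x80 else "out",
        "lun": lun,
        "cdb": raw[15:15 + cb_len],
    }


def decode_read10(cdb):
    """Pull the LBA and block count out of a SCSI READ(10) command."""
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) command")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


events = [parse_record(line) for line in TRACE.splitlines() if line.strip()]
latencies = urb_latencies(events)
for addr, status, elapsed_us in latencies:
    print(f"{addr}: status {status}, completed after {elapsed_us / 1e6:.3f} s")

cbw = decode_cbw(events[0]["data"])
read10 = decode_read10(cbw["cdb"])
print(f"CBW asks for {cbw['transfer_length']} bytes, direction {cbw['direction']}")
print(f"READ(10): {read10['blocks']} blocks starting at LBA {read10['lba']}")
```

Run against the quoted records, this reports roughly 30.16 seconds between submission and completion of the 4096-byte bulk-in URB, which ends with status -104. That matches the description of a read that timed out after 30 seconds and was cancelled. The decoded command is a READ(10) for 8 blocks, which equals the 4096-byte transfer length if the stick uses 512-byte logical blocks (an assumption, since the trace does not state the block size).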