On Fri, 4 Mar 2016, Rian Hunter wrote:

> On Fri, 4 Mar 2016, Alan Stern wrote:
> > On Fri, 4 Mar 2016, Rian Hunter wrote:
> >
> >> Thanks for the great tip. I used tcpdump on the bus of my device and
> >> waited for a couple days for the effect to happen again.
> >>
> >> I found that whenever I would get the "usb 2-3: reset SuperSpeed USB
> >> device number 2 using xhci_hcd" what was happening at the protocol
> >> level was:
> >>
> >> HOST: DO READ
> >> DEVICE: CONFIRMED
> >> HOST: SEND ME DATA
> >> DEVICE: <DATA>
> >> HOST: SEND ME STATUS
> >> DEVICE: SCSI CHECK CONDITION
> >>
> >> After the "Check Condition" the stack would initiate a USB reset.
> >
> > More details would help, such as the actual usbmon output for one of
> > those failed commands (plus the following data).
> >
>
> This sequence is attached as "spurious_reset.pcap"

Here is the relevant portion, translated into the usbmon text format:

4166c880 0.280079 S Bo:2:003:2 -115 31 = 55534243 7c333100 00e00000 80020a28 001ef9d1 00000070 00000000 000000
4166c880 0.280123 C Bo:2:003:2 0 31 >

Send a READ(10) command for 57344 bytes (112 blocks, assuming the disk
uses 512-byte blocks).

eba42640 0.280149 S Bi:2:003:1 -115 57344 <
eba42640 0.293002 C Bi:2:003:1 -32 49152 = c50636e3 28bff5e6 62a13394 30a2b1b8 76265667 88e30a14 81949db7 3e6ecc7b

The device sends back only 49152 bytes of data, followed by a STALL.

4166c880 0.293096 S Co:2:003:0 s 02 01 0000 0081 0000 0
4166c880 0.293145 C Co:2:003:0 0 0

Clear the halt condition.

4166c880 0.293164 S Bi:2:003:1 -115 13 <
4166c880 0.293217 C Bi:2:003:1 0 13 = 55534253 7c333100 27b50312 3d

Receive the status. The response is not meaningful; dCSWDataResidue
and bCSWStatus are both garbage. In particular, since status is not 1,
the device did _not_ report Check Condition.

At this point there is little that usb-storage can do other than a
reset.

> >> Now that was all well and fine, no interruptions in service.
> >> What eventually caused the entire device to disconnect was this sequence:
> >>
> >> HOST: DO READ
> >> DEVICE: CONFIRMED
> >> HOST: SEND ME DATA
> >> DEVICE: <DATA>
> >> HOST: SEND ME STATUS
> >> (300 seconds pass...)
> >> DEVICE: USB URB ECONNRESET
> >>
> >> I believe what is happening here is that the firmware of the bridge
> >> device timed out waiting for the SCSI status coming from the actual
> >> HDD (after enough "check condition" return codes the device decided to
> >> die). The bridge firmware sends a final ECONNRESET then it decides to
> >> disconnect completely.
> >
> > No, that doesn't sound right. SCSI commands typically have a 30-second
> > timeout, so there should have been a reset after 30 seconds, not a
> > disconnect after 300.
> >
>
> This sequence is attached as "disconnect.pcap". Note that before all of
> this I had changed the command timeout of the block device using the
> equivalent of:
>
> # echo 300 > /sys/block/sdb/device/timeout

Okay, that explains the long delay.

Incidentally, the ECONNRESET did not come from the bridge. It came
from usb-storage, when the transfer was aborted.

> >> From "https://en.wikipedia.org/wiki/SCSI_check_condition" it says that
> >> when a "check condition" status is returned, the device goes into
> >> a special "contingent allegiance condition" state and the host *should*
> >> retrieve more information using a "Request Sense" command. The Linux
> >> stack does not seem to be doing this.
> >
> > Not true. It does do this, very faithfully.
> >
>
> Ah yes, you're right, now that I'm actually looking at the
> code. Though, I'm not sure if the transport layer is returning
> "USB_STOR_TRANSPORT_FAILED" or "USB_STOR_TRANSPORT_ERROR." From the
> "spurious_reset.pcap" capture, as you'll see, no REQUEST_SENSE is
> being sent.

In that trace, the return code would have been
USB_STOR_TRANSPORT_ERROR. usb-storage did not send a REQUEST SENSE
command because the bridge did not send Check Condition status.
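For anyone who wants to check the decoding of the spurious_reset trace
by hand, here is a minimal sketch in Python. It unpacks the 31-byte
command wrapper (CBW) and the 13-byte status wrapper (CSW) from the
usbmon lines and tests whether the status is usable. The field layout
follows the Bulk-Only Transport spec; the function names and the
"meaningful" test are illustrative assumptions based on one reading of
that spec, not the code usb-storage actually runs.

```python
import struct

# Payloads copied verbatim from the usbmon trace.
CBW_HEX = "55534243 7c333100 00e00000 80020a28 001ef9d1 00000070 00000000 000000"
CSW_HEX = "55534253 7c333100 27b50312 3d"


def parse_cbw(raw):
    """Decode a Command Block Wrapper (wrapper fields are little-endian)."""
    sig, tag, data_len, flags, lun, cdb_len = struct.unpack_from("<4sIIBBB", raw)
    return {
        "signature": sig,
        "tag": tag,
        "data_len": data_len,                   # dCBWDataTransferLength
        "direction_in": bool(flags & 0x80),     # bit 7 set: device-to-host
        "lun": lun & 0x0F,
        "cdb": raw[15:15 + (cdb_len & 0x1F)],   # embedded SCSI command
    }


def parse_read10(cdb):
    """Decode a READ(10) CDB (SCSI fields are big-endian)."""
    opcode, lba, blocks = struct.unpack(">BxIxH", cdb[:9])
    return {"opcode": opcode, "lba": lba, "blocks": blocks}


def parse_csw(raw):
    """Decode a Command Status Wrapper."""
    sig, tag, residue, status = struct.unpack("<4sIIB", raw)
    return {"signature": sig, "tag": tag, "residue": residue, "status": status}


def csw_is_meaningful(csw, cbw):
    """Assumed test: valid signature and tag, and a sane status/residue pair."""
    if csw["signature"] != b"USBS" or csw["tag"] != cbw["tag"]:
        return False
    if csw["status"] == 2:                      # Phase Error
        return True
    # 0 = Passed, 1 = Failed (the case that triggers REQUEST SENSE)
    return csw["status"] in (0, 1) and csw["residue"] <= cbw["data_len"]


cbw = parse_cbw(bytes.fromhex(CBW_HEX))
read10 = parse_read10(cbw["cdb"])
csw = parse_csw(bytes.fromhex(CSW_HEX))

print("opcode 0x%02x, %d blocks, %d bytes requested"
      % (read10["opcode"], read10["blocks"], cbw["data_len"]))
print("status 0x%02x, residue %d, meaningful: %s"
      % (csw["status"], csw["residue"], csw_is_meaningful(cbw=cbw, csw=csw)))
```

The signature and tag in the CSW do match the CBW, so the wrapper is
well formed; it is only the status byte (0x3d rather than 0, 1 or 2)
and the residue (far larger than the 57344 bytes requested) that make
it unusable.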

> Excited to see what you'll be able to glean from the captures, Thanks
> for your help!

You're welcome.

Alan Stern