Re: Spurious Mass Storage Device Resets

On Mon, 29 Feb 2016, Rian Hunter wrote:

> Hello,
> 
> I own a JBOD SATA<->USB 3.0 bridge. All information about this device 
> and my kernel version is below.
> 
> My trouble is that every so often this device will undergo a virtual 
> USB disconnect and then be reconnected. This will cause all drives to 
> disappear and reappear on the system. All previously mounted file 
> systems will have to be remounted.
> 
> Here is the relevant dmesg upon initially plugging in the device:
> 
> [400295.378851] usb 2-3: new SuperSpeed USB device number 3 using xhci_hcd
> [400295.396512] usb 2-3: New USB device found, idVendor=152d, idProduct=0567
> [400295.396524] usb 2-3: New USB device strings: Mfr=10, Product=11, SerialNumber=5
> [400295.396531] usb 2-3: Product: USB to ATA/ATAPI Bridge
> [400295.396537] usb 2-3: Manufacturer: JMicron
> [400295.396542] usb 2-3: SerialNumber: 152D00539000
> [400295.398121] usb-storage 2-3:1.0: USB Mass Storage device detected
> [400295.402320] usb-storage 2-3:1.0: Quirks match for vid 152d pid 0567: 5000000
> [400295.402399] scsi host4: usb-storage 2-3:1.0
> [400296.401394] scsi 4:0:0:0: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> [400296.402428] scsi 4:0:0:1: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> [400296.403322] scsi 4:0:0:2: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> [400296.404113] scsi 4:0:0:3: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> [400296.404920] scsi 4:0:0:4: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> [400296.405762] scsi 4:0:0:5: Direct-Access     ST2000DL 004 HD204UI      0125 PQ: 0 ANSI: 6
> [400296.406576] scsi 4:0:0:6: Direct-Access     WL1000GS A6472            0125 PQ: 0 ANSI: 6
> 
> Here is the relevant dmesg when the device disconnects:
> 
> [383320.013453] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> [383337.784897] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> ...
> [396598.497860] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> [399452.105930] xhci_hcd 0000:00:14.0: WARN Event TRB for slot 3 ep 2 with no TDs queued?
> [399821.971271] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> [399955.972112] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> [399956.184142] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
> [400266.037316] usb 2-3: Device not responding to setup address.
> [400266.244922] usb 2-3: Device not responding to setup address.
> [400266.445391] usb 2-3: device not accepting address 2, error -71
> [400270.434213] usb usb2-port3: Cannot enable. Maybe the USB cable is bad?
> [400274.422951] usb usb2-port3: Cannot enable. Maybe the USB cable is bad?
> [400278.411675] usb usb2-port3: Cannot enable. Maybe the USB cable is bad?
> [400278.412144] usb 2-3: USB disconnect, device number 2
> 
> Afterward the device connects again.
> 
> Another user of this same hardware claims that this happens when the 
> bridge encounters a faulty (or slow-to-respond) HDD. When the HDD is 
> slow to respond, the bridge resets itself.
> (http://forum.mediasonic.ca/viewtopic.php?f=58&t=3115#p13119)
> 
> Initially I accepted this explanation but now I am starting to 
> question it. For one, all of my HDDs seem to be in good shape 
> according to their S.M.A.R.T. attributes and S.M.A.R.T. self-checks.
> 
> Another thing that looks fishy to me is the excessive number of 
> "reset" messages in dmesg. It looks like sometimes the reset works 
> and sometimes it doesn't. There are many reset attempts in this 
> specific dmesg, but in previous incidents there could be as few as one.
> 
> My current guess is some kind of spurious error is happening in 
> usb_stor_invoke_transport(), causing a call to usb_stor_port_reset(), 
> usb_reset_device(), usb_reset_and_verify_device(), and ultimately 
> hub_port_init(), which is causing the "reset" message. My guess 
> is that the kernel is timing out on a USB command to the device, 
> invoking a reset, and then timing out on a successful reset since the 
> device may still be busy doing something else.
> 
> Is this possible? Does the USB stack call reset on a command timeout? 

Well, the combination of usb-storage and the SCSI stack does.  
usb-storage itself doesn't have any notion of commands timing out; it
merely does a reset whenever the SCSI stack tells it to abort an
ongoing command.
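
If each "reset" line in the log above follows one such abort (an assumption; 
other error paths can also trigger a reset), the spacing between those lines 
gives a rough idea of how often commands were being aborted. The sketch 
below is only an illustration: it uses the reset lines quoted above, the 
function names are made up, and the quoted log elides lines ("..."), so the 
gap spanning the elision is not a true gap between consecutive resets.

```python
import re

# Reset-related lines copied from the dmesg excerpt quoted above.
# The original excerpt elides lines between the second and third entry.
DMESG = """\
[383320.013453] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
[383337.784897] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
[396598.497860] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
[399452.105930] xhci_hcd 0000:00:14.0: WARN Event TRB for slot 3 ep 2 with no TDs queued?
[399821.971271] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
[399955.972112] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
[399956.184142] usb 2-3: reset SuperSpeed USB device number 2 using xhci_hcd
"""

# Matches "[<timestamp>] usb <port>: reset <speed> USB device ..."
RESET_RE = re.compile(r"^\[\s*(\d+\.\d+)\] usb (\S+): reset \S+ USB device")


def reset_times(text, port="2-3"):
    """Return the dmesg timestamps (seconds) of reset lines for one port."""
    times = []
    for line in text.splitlines():
        m = RESET_RE.match(line)
        if m and m.group(2) == port:
            times.append(float(m.group(1)))
    return times


def gaps(times):
    """Return the time in seconds between consecutive timestamps."""
    return [round(b - a, 3) for a, b in zip(times, times[1:])]


times = reset_times(DMESG)
print(len(times), "resets on port 2-3")
print("gaps (s):", gaps(times))
```

In this excerpt the last two resets are about 0.2 seconds apart, while 
others are separated by minutes or hours.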

> Is there a way to increase such a timeout?

No.  The values are already extremely generous.  For example, reading 
240 KB (!) has a 30-second timeout.
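
To put that figure in perspective, here is the arithmetic as a small 
sketch. It assumes "KB" means 1024 bytes and that the 30 seconds covers the 
whole transfer; the function name is purely illustrative.

```python
def min_throughput(transfer_bytes, timeout_s):
    """Average rate (bytes/s) below which the transfer would exceed the timeout."""
    return transfer_bytes / timeout_s


# 240 KB read with a 30-second timeout (KB taken as 1024 bytes, an assumption).
rate = min_throughput(240 * 1024, 30)
print(rate, "bytes/s")  # 8192.0 bytes/s, i.e. 8 KB/s
```

In other words, the command only times out if the device averages less 
than about 8 KB/s over the whole read.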

> Additionally, I'm not sure what the "WARN" line means.

I don't think it's particularly relevant.  Also, I remember seeing 
something about changing it (or the conditions under which it gets 
printed) recently.

Anyway, you can get more information about the resets and their causes 
if you collect a usbmon trace.  See Documentation/usb/usbmon.txt for 
instructions.
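
As a rough sketch of what that involves: the device shows up as "usb 2-3", 
so it sits on bus 2, and the text-format trace for that bus is the "2u" 
file under the usbmon directory in debugfs. The helper below assumes 
debugfs is mounted at /sys/kernel/debug, the usbmon module is loaded, and 
the script runs as root; the function names and the output path are 
illustrative, not part of any kernel interface.

```python
USBMON_DIR = "/sys/kernel/debug/usb/usbmon"  # assumes debugfs mounted here


def bus_of(devname):
    """Bus number from a kernel device name such as '2-3' (bus 2, port 3)."""
    return int(devname.split("-", 1)[0])


def usbmon_text_path(bus):
    """Path of the usbmon text-format trace file for one bus."""
    return f"{USBMON_DIR}/{bus}u"


def capture(bus, out_path, max_lines):
    """Copy up to max_lines trace lines to out_path.

    Reading blocks until the kernel reports USB traffic on the bus, so this
    returns only after max_lines events have been seen (or the file ends).
    """
    with open(usbmon_text_path(bus), "r") as src, open(out_path, "w") as dst:
        for _ in range(max_lines):
            line = src.readline()
            if not line:
                break
            dst.write(line)


# Example (not run here; needs root and the usbmon module):
#   capture(bus_of("2-3"), "/tmp/usbmon-bus2.txt", 100000)
```

Start the capture before the problem occurs and stop it once the resets 
have shown up in dmesg, so that the trace covers the commands leading up 
to them.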

Alan Stern
