On Wed, Sep 26, 2012 at 5:50 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 25 Sep 2012, Sarah Sharp wrote:
>
>> Alan, I'm wondering if the xHCI ring expansion is causing issues with
>> USB hard drives under xHCI. Testing with a Buffalo USB 3.0 hard drive
>> with an NEC uPD720200 xHCI host, I see that the usb-storage and SCSI
>> initialization produces I/O errors on random sectors in 3.4.0, 3.4.6,
>> and 3.5.0. I can't get those errors to be reproduced in 3.3.1.
>>
>> The xHCI ring expansion was added in 3.4, and we changed the xHCI's
>> sg_tablesize:
>>
>> int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
>> {
>> ...
>>         /* Accept arbitrarily long scatter-gather lists */
>>         hcd->self.sg_tablesize = ~0;
>>
>> The usb-storage driver sets the tablesize thus:
>>
>> static unsigned int usb_stor_sg_tablesize(struct usb_interface *intf)
>> {
>>         struct usb_device *usb_dev = interface_to_usbdev(intf);
>>
>>         if (usb_dev->bus->sg_tablesize) {
>>                 return usb_dev->bus->sg_tablesize;
>>         }
>>         return SG_ALL;
>> }
>>
>> I notice that SG_ALL is set to SCSI_MAX_SG_SEGMENTS, which is only 128.
>> Should we be passing an arbitrarily large number to the SCSI core?
>
> Yes, there's no reason not to. The block layer will make sure that
> each individual request has a sufficiently small number of segments.
>
>> There's some wording in include/scsi/scsi.h about also limiting the
>> number of chained sgs to 2048. I'm wondering if we're hitting some bugs
>> in the SCSI layer because we're setting the sg_tablesize so high.
>
> I doubt it. Anyway, this stuff is handled by the block layer now, not
> the SCSI layer. If you look through drivers/scsi, you'll see that
> SG_ALL is used only in various SCSI interface drivers, not in the core.
>
>> Alternately, we could be hitting bugs in the USB 3.0 firmware when we
>> attempt to issue a read or write that's too big. The read on Adrian's
>> hard drive failed on a bigger read request (122880 bytes).
>> It would be interesting to see if it works fine if the xHCI sg_tablesize
>> is limited. I'm going to try that with my own drive on 3.5.4 and see
>> if it helps.
>
> There were examples in the earlier usbmon traces where 122880-byte
> reads succeeded, for whatever that's worth...
>
> I doubt very much that you are anywhere close to hitting that limit.
> If a 120-KB transfer has more than 128 SG segments then on average each
> segment would be under 1024 bytes, a lot smaller than a page, which
> seems unlikely. I don't think I've ever seen a transfer needing more
> than about 8 segments.
>
> Alan Stern
>

OK, I'm back on vanilla 3.4.11 with CONFIG_USB_XHCI_HCD_DEBUGGING
disabled. I still see:

2012-09-26T19:52:16.661604+03:00 d3xt3r01 kernel: [ 1213.416759] usb 3-2.4: reset SuperSpeed USB device number 11 using xhci_hcd
2012-09-26T19:52:16.674632+03:00 d3xt3r01 kernel: [ 1213.429351] xhci_hcd 0000:04:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88011d3c6980
2012-09-26T19:52:16.674665+03:00 d3xt3r01 kernel: [ 1213.429363] xhci_hcd 0000:04:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88011d3c69c0

T: Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 8 Spd=5000 MxCh= 4
D: Ver= 3.00 Cls=09(hub ) Sub=00 Prot=03 MxPS= 9 #Cfgs= 1
P: Vendor=2109 ProdID=0810 Rev= 3.74
S: Manufacturer=VIA Labs, Inc.
S: Product=4-Port USB 3.0 Hub
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 2mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=13(Int.) MxPS= 2 Ivl=4096ms

T: Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 4 Spd=480 MxCh= 4
D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=2109 ProdID=3431 Rev= 2.74
S: Product=USB2.0 Hub
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms

T: Bus=03 Lev=02 Prnt=08 Port=00 Cnt=01 Dev#= 9 Spd=5000 MxCh= 0
D: Ver= 3.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=1058 ProdID=1140 Rev=10.03
S: Manufacturer=Western Digital
S: Product=My Book 1140
S: SerialNumber=5743415A4144303235323133
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 2mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms

T: Bus=03 Lev=02 Prnt=08 Port=03 Cnt=04 Dev#= 11 Spd=5000 MxCh= 0
D: Ver= 3.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=1058 ProdID=1140 Rev=10.03
S: Manufacturer=Western Digital
S: Product=My Book 1140
S: SerialNumber=574D415A4135343330323937
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 2mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms

So I need to cat 3u, right? The trace is available at
http://d3xt3r01.tk/~dexter/usbmon/1348678668_3u

After the copy, I see:

2012-09-26T19:52:51.477641+03:00 d3xt3r01 kernel: [ 1248.232213] hub 3-2:1.0: Cannot enable port 4. Maybe the USB cable is bad?
2012-09-26T19:52:52.003074+03:00 d3xt3r01 kernel: [ 1248.757636] sd 16:0:0:0: Device offlined - not ready after error recovery
2012-09-26T19:52:52.003081+03:00 d3xt3r01 kernel: [ 1248.757652] sd 16:0:0:0: [sdd] Unhandled error code
2012-09-26T19:52:52.003088+03:00 d3xt3r01 kernel: [ 1248.757656] sd 16:0:0:0: [sdd] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
2012-09-26T19:52:52.003095+03:00 d3xt3r01 kernel: [ 1248.757663] sd 16:0:0:0: [sdd] CDB: Read(10): 28 00 17 ab ab e0 00 00 f0 00
2012-09-26T19:52:52.003101+03:00 d3xt3r01 kernel: [ 1248.757680] end_request: I/O error, dev sdd, sector 397126624
2012-09-26T19:52:52.003107+03:00 d3xt3r01 kernel: [ 1248.757721] sd 16:0:0:0: rejecting I/O to offline device
2012-09-26T19:52:52.003112+03:00 d3xt3r01 kernel: [ 1248.757730] sd 16:0:0:0: [sdd] killing request
2012-09-26T19:52:52.003118+03:00 d3xt3r01 kernel: [ 1248.757750] sd 16:0:0:0: rejecting I/O to offline device
2012-09-26T19:52:52.003125+03:00 d3xt3r01 kernel: [ 1248.757769] sd 16:0:0:0: rejecting I/O to offline device
2012-09-26T19:52:52.003131+03:00 d3xt3r01 kernel: [ 1248.757788] sd 16:0:0:0: rejecting I/O to offline device
2012-09-26T19:52:52.003138+03:00 d3xt3r01 kernel: [ 1248.757814] sd 16:0:0:0: [sdd] Unhandled error code
2012-09-26T19:52:52.003145+03:00 d3xt3r01 kernel: [ 1248.757821] sd 16:0:0:0: [sdd] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
2012-09-26T19:52:52.003152+03:00 d3xt3r01 kernel: [ 1248.757829] sd 16:0:0:0: [sdd] CDB: Read(10): 28 00 17 ab ac d0 00 00 10 00
2012-09-26T19:52:52.003159+03:00 d3xt3r01 kernel: [ 1248.757849] end_request: I/O error, dev sdd, sector 397126864
2012-09-26T19:52:52.018669+03:00 d3xt3r01 kernel: [ 1248.773433] usb 3-2.4: USB disconnect, device number 11

This happened while copying (using mc) from /dev/sdb1 (mounted at
/media/sdd1) to /dev/sdd1 (mounted at /media/sde1). Sorry for the
confusing names: I can't predict the order in which the drives get
enumerated, so I use UUIDs in fstab.
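
For reference, the two Read(10) CDBs in the log above can be decoded by
hand to check them against the "sector" values the kernel printed and
against the 122880-byte request size mentioned in the quoted text. The
following is only a rough sketch in C. The helper names (read10_lba,
read10_blocks, read10_bytes, avg_segment_bytes) are made up for
illustration and are not kernel functions, and the 512-byte logical
block size is an assumption (it is consistent with 240 blocks being
122880 bytes).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 512-byte logical blocks (not stated in the log itself). */
#define BLOCK_SIZE 512u

/*
 * READ(10) layout: byte 0 is the opcode (0x28), bytes 2-5 hold the
 * logical block address and bytes 7-8 the transfer length in blocks,
 * both big-endian.
 */
uint32_t read10_lba(const uint8_t cdb[10])
{
	assert(cdb[0] == 0x28);	/* only READ(10) is handled here */
	return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
	       ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
}

uint32_t read10_blocks(const uint8_t cdb[10])
{
	assert(cdb[0] == 0x28);
	return ((uint32_t)cdb[7] << 8) | (uint32_t)cdb[8];
}

/* Transfer size in bytes, under the BLOCK_SIZE assumption above. */
uint32_t read10_bytes(const uint8_t cdb[10])
{
	return read10_blocks(cdb) * BLOCK_SIZE;
}

/* Average segment size if a transfer were split into nsegs SG segments. */
uint32_t avg_segment_bytes(uint32_t bytes, uint32_t nsegs)
{
	assert(nsegs != 0);
	return bytes / nsegs;
}

/* Print the decoded fields of one READ(10) CDB. */
void print_read10(const uint8_t cdb[10])
{
	printf("lba=%" PRIu32 " blocks=%" PRIu32 " bytes=%" PRIu32 "\n",
	       read10_lba(cdb), read10_blocks(cdb), read10_bytes(cdb));
}
```

Feeding it the first CDB (28 00 17 ab ab e0 00 00 f0 00) gives LBA
397126624, which matches the first end_request line, and 240 blocks,
i.e. 122880 bytes with 512-byte blocks. Splitting that into 128 segments
would give an average of 960 bytes per segment, which is the "under 1024
bytes" figure from the quoted reply. The second CDB decodes to LBA
397126864 and 16 blocks.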
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html