Re: xhci_hcd 0000:0b:00.0: WARN: transfer error on endpoint

Andiry Xu <andiry.xu@xxxxxxx> · Thu, 29 Mar 2012 18:27:34 +0800

On 03/29/2012 07:11 AM, Sarah Sharp wrote:
> On Tue, Mar 27, 2012 at 07:29:23PM +0200, Martin Mokrejs wrote:
>> Hi,
>>   I already mentioned on this list I am having some issues with my new 2.7TiB hard drive
>> which I bought at about a month ago. I crashed ext3 fs two days ago on it when I filled
>> up the disk with the data. Yesterday I learned Seagate released a new firmware for it.
>> I flashed it from CC46 (I think) to CC4C. I created an ext4 filesystem now, just in case.
>> I started copying fo teh data and two hours later I see on the console:
>>
>> vostro kernel: journal commit I/O error
>>
>>   Aside from the fact I do not understand why is it coming from "JBD2" it seems it caused
>> the drive to be unmounted. Is that a part of ext4 driver?
>>
>>   Anyway, it looks the whole story started here:
>>
>> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: transfer error on endpoint
> 
> Which kernel version are you running?  These useless warning messages
> were removed in 3.3, ISTR.  They're harmless.
> 

This one is a critical error (USB transaction error) and is not removed.

>> Mar 27 18:10:23 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd
>> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
>> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
>> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
>> Mar 27 18:10:23 vostro kernel: usb 4-1: Successful Endpoint Configure command
>> Mar 27 18:10:54 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd
> 
> The usb-storage driver decided to reset the device, for whatever reason.
> You can enable CONFIG_USB_STORAGE_DEBUG to find out which command caused
> it to be reset.
> 
>> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
>> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
>> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
>> Mar 27 18:10:54 vostro kernel: usb 4-1: Successful Endpoint Configure command
>> Mar 27 18:11:04 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd
> 
> And then it gets reset again, 10 seconds later.
> 
>> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
>> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
>> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
>> Mar 27 18:11:04 vostro kernel: usb 4-1: Successful Endpoint Configure command
>> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
>> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: warm reset port 1
> 
> I wonder why you needed to have a warm reset for the port?
> 

I think it's because the link state is SS_INACTIVE after reset.

Thanks,
Andiry

>> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 50ms
>> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 200ms
>> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 200ms
> 
> Mmm, the warm port reset timed out, which isn't very good.  Perhaps your
> TI host is quirky in that respect?
> 
>> Mar 27 18:11:05 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
>> Mar 27 18:11:06 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
>> Mar 27 18:11:06 vostro kernel: hub 4-0:1.0: port 1, status 02a0, change 0001, 5.0 Gb/s
>> Mar 27 18:11:06 vostro kernel: usb 4-1: USB disconnect, device number 12
> 
> And then the device disconnected.  Sounds like flaky hardware.  Maybe if
> you figure out what SCSI command is failing for the device, we can apply
> a storage device quirk, and we could avoid having to reset the device.
> 
>> Mar 27 18:11:06 vostro kernel: usb 4-1: unregistering device
>> Mar 27 18:11:06 vostro kernel: usb 4-1: unregistering interface 4-1:1.0
>> Mar 27 18:11:09 vostro kernel: sd 17:0:0:0: Device offlined - not ready after error recovery
>> Mar 27 18:11:09 vostro kernel: ice sdc1, logical block 204754273
>> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754274
>> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754275
>> Mar 27 18:11:09 vostro kernel: EXT4-fs warning (device sdc1): ext4_end_bio:244: I/O error writing to inode 51118127 (offset 30695424 size 126976 starting block 204754532)
>> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754276
>> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754277
>>
>>
>>   Would anybody please comment whether the logs from xHCI side look sane?
> 
> It helps if you Cc me (I'm listed in MAINTAINERS as the xHCI driver
> maintainer).  Otherwise the linux-usb mail ends up in another folder and
> can get ignored for days if I get busy. :)
> 
>> I have attached the
>> complete log. From that it looks that mkfs.ext4 caushed the 3.2.12 kernel to complain
>> with "mount: page allocation failure: order:5, mode:0xd0" but that is another story.
>> Now I wonder how can I prove the disk is at fault and not the USWB3.0-to-SATAII
>> bridge or linux kernel. The disk is rather hot (compared to another drive of exactly same
>> model  number) I bought 2 days ago (and which had the newer firmware, and therefore I
>> went to upgrade the older-one with a tool from seagate website).
> 
> Can you connect the disk to a desktop box and run the same sort of
> workload?  Connecting it via a USB 2.0 to SATA connector might not be
> helpful, because if the drive is sensitive to the speed at which it's
> accessed, then the error might not show up because USB 2.0 is going to
> be much slower than USB 3.0.
> 
>> 0b:00.0 USB controller: Texas Instruments Device 8241 (rev 02) (prog-if 30 [XHCI])
>>         Subsystem: Dell Device 04b3
>>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 16
>>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>>         Capabilities: [40] Power Management version 3
>>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>>                 Address: 0000000000000000  Data: 0000
>>         Capabilities: [70] Express (v2) Endpoint, MSI 00
>>                 DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>>                 LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
>>                         ClockPM+ Surprise- LLActRep- BwNot-
>>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
>>                         ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>>                          Compliance De-emphasis: -6dB
>>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>         Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
>>                 Vector table: BAR=2 offset=00000000
>>                 PBA: BAR=2 offset=00001000
>>         Capabilities: [100 v2] Advanced Error Reporting
>>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                 CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
>>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>>         Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
>>         Kernel driver in use: xhci_hcd
> 
> Oh, I see.  It's a TI host.  Yeah, I haven't done much testing on that
> particular host controller, so I don't know if it works well.
> 
> Sarah Sharp
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html