Re: xhci_hcd 0000:0b:00.0: WARN: transfer error on endpoint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 27, 2012 at 07:29:23PM +0200, Martin Mokrejs wrote:
> Hi,
>   I already mentioned on this list I am having some issues with my new 2.7TiB hard drive
> which I bought at about a month ago. I crashed ext3 fs two days ago on it when I filled
> up the disk with the data. Yesterday I learned Seagate released a new firmware for it.
> I flashed it from CC46 (I think) to CC4C. I created an ext4 filesystem now, just in case.
> I started copying fo teh data and two hours later I see on the console:
> 
> vostro kernel: journal commit I/O error
> 
>   Aside from the fact I do not understand why is it coming from "JBD2" it seems it caused
> the drive to be unmounted. Is that a part of ext4 driver?
> 
>   Anyway, it looks the whole story started here:
> 
> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: transfer error on endpoint

Which kernel version are you running?  These useless warning messages
were removed in 3.3, ISTR.  They're harmless.

> Mar 27 18:10:23 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd
> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
> Mar 27 18:10:23 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
> Mar 27 18:10:23 vostro kernel: usb 4-1: Successful Endpoint Configure command
> Mar 27 18:10:54 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd

The usb-storage driver decided to reset the device, for whatever reason.
You can enable CONFIG_USB_STORAGE_DEBUG to find out which command caused
it to be reset.

> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
> Mar 27 18:10:54 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
> Mar 27 18:10:54 vostro kernel: usb 4-1: Successful Endpoint Configure command
> Mar 27 18:11:04 vostro kernel: usb 4-1: reset SuperSpeed USB device number 12 using xhci_hcd

And then it gets reset again, 10 seconds later.

> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: WARN: short transfer on control ep
> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff540
> Mar 27 18:11:04 vostro kernel: xhci_hcd 0000:0b:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800073ff580
> Mar 27 18:11:04 vostro kernel: usb 4-1: Successful Endpoint Configure command
> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: warm reset port 1

I wonder why you needed to have a warm reset for the port?

> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 50ms
> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 200ms
> Mar 27 18:11:04 vostro kernel: hub 4-0:1.0: port 1 not warm reset yet, waiting 200ms

Mmm, the warm port reset timed out, which isn't very good.  Perhaps your
TI host is quirky in that respect?

> Mar 27 18:11:05 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
> Mar 27 18:11:06 vostro kernel: hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0002
> Mar 27 18:11:06 vostro kernel: hub 4-0:1.0: port 1, status 02a0, change 0001, 5.0 Gb/s
> Mar 27 18:11:06 vostro kernel: usb 4-1: USB disconnect, device number 12

And then the device disconnected.  Sounds like flaky hardware.  Maybe if
you figure out what SCSI command is failing for the device, we can apply
a storage device quirk, and we could avoid having to reset the device.

> Mar 27 18:11:06 vostro kernel: usb 4-1: unregistering device
> Mar 27 18:11:06 vostro kernel: usb 4-1: unregistering interface 4-1:1.0
> Mar 27 18:11:09 vostro kernel: sd 17:0:0:0: Device offlined - not ready after error recovery
> Mar 27 18:11:09 vostro kernel: ice sdc1, logical block 204754273
> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754274
> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754275
> Mar 27 18:11:09 vostro kernel: EXT4-fs warning (device sdc1): ext4_end_bio:244: I/O error writing to inode 51118127 (offset 30695424 size 126976 starting block 204754532)
> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754276
> Mar 27 18:11:09 vostro kernel: Buffer I/O error on device sdc1, logical block 204754277
> 
> 
>   Would anybody please comment whether the logs from xHCI side look sane?

It helps if you Cc me (I'm listed in MAINTAINERS as the xHCI driver
maintainer).  Otherwise the linux-usb mail ends up in another folder and
can get ignored for days if I get busy. :)

> I have attached the
> complete log. From that it looks that mkfs.ext4 caushed the 3.2.12 kernel to complain
> with "mount: page allocation failure: order:5, mode:0xd0" but that is another story.
> Now I wonder how can I prove the disk is at fault and not the USWB3.0-to-SATAII
> bridge or linux kernel. The disk is rather hot (compared to another drive of exactly same
> model  number) I bought 2 days ago (and which had the newer firmware, and therefore I
> went to upgrade the older-one with a tool from seagate website).

Can you connect the disk to a desktop box and run the same sort of
workload?  Connecting it via a USB 2.0 to SATA connector might not be
helpful, because if the drive is sensitive to the speed at which it's
accessed, then the error might not show up because USB 2.0 is going to
be much slower than USB 3.0.

> 0b:00.0 USB controller: Texas Instruments Device 8241 (rev 02) (prog-if 30 [XHCI])
>         Subsystem: Dell Device 04b3
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>                 LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
>                         ClockPM+ Surprise- LLActRep- BwNot-
>                 LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
>                         ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
>                 Vector table: BAR=2 offset=00000000
>                 PBA: BAR=2 offset=00001000
>         Capabilities: [100 v2] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
>         Kernel driver in use: xhci_hcd

Oh, I see.  It's a TI host.  Yeah, I haven't done much testing on that
particular host controller, so I don't know if it works well.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux