Re: Endpoint is not halted

On Mon, Oct 15, 2012 at 11:28:34AM +0530, ankit patel wrote:
> Hi Sarah,

Hi Ankit, thanks for the bug report.

>    We have found some problems during inserting device. It shows the error
> like
> 
>    -      [   85.694152] xhci_hcd 0000:01:00.0: Endpoint 0x84 not halted,
>    refusing to reset.
>    -      [   85.694156] xhci_hcd 0000:01:00.0: Endpoint 0x3 not halted,
>    refusing to reset.

Don't worry about those messages, they're harmless.

> it also shows error for LPM like
> 
>    - [   85.692284] usb 3-1: Parent hub missing LPM exit latency info.
>     Power management will be impacted.

That means your host probably doesn't support Link Power Management.
You won't get as much power savings as you would if it did, but it
shouldn't affect the overall behavior of the host.

> At last HC died error comes.
> 
>    - [  132.885582] xhci_hcd 0000:01:00.0: HC died; cleaning up
>    - [  132.885590] xhci_hcd 0000:01:00.0: xHCI host controller is dead.
> 
> 
>  We are using WD harddisk. I am attaching the log file. Please have a look.
> Does our Hard disk is faulty or our host has some bugs?.

I don't think your hard drive is faulty.  Either there's an xHCI driver
bug, or a host hardware bug.

The logs show that a transfer was received successfully:

> [   97.310269] xhci_hcd 0000:01:00.0: WARN halted endpoint, queueing URB anyway.
> [   97.310280] xhci_hcd 0000:01:00.0: Ignoring reset ep completion code of 1
> [   97.310285] xhci_hcd 0000:01:00.0: Successful Set TR Deq Ptr cmd, deq = @2f32f431
> [   97.310428] xhci_hcd 0000:01:00.0: ep 0x84 - asked for 96 bytes, 76 bytes untransferred
> [   97.310434] xhci_hcd 0000:01:00.0: Giveback URB f0a5d200, len = 20, expected = 96, status = -121

And then the host gave us an event with a DMA pointer to a transfer that
we didn't recognize:

> [   97.421922] xhci_hcd 0000:01:00.0: ERROR Transfer event TRB DMA ptr not part of current TD

Some time later, we print the endpoint rings, and then the SCSI layer
attempts to cancel the transfer.

> [  124.384026] xhci_hcd 0000:01:00.0: Poll event ring: 4294923392
> [  124.384035] xhci_hcd 0000:01:00.0: op reg status = 0x0
> [  124.384041] xhci_hcd 0000:01:00.0: ir_set 0 pending = 0x2
> [  124.384045] xhci_hcd 0000:01:00.0: HC error bitmask = 0x204
> [  124.384049] xhci_hcd 0000:01:00.0: Event ring:
> [  124.384056] xhci_hcd 0000:01:00.0: @000000002f2d6400 2f32e690 00000000 01000000 01068000
> [  124.384062] xhci_hcd 0000:01:00.0: @000000002f2d6410 2f32f8c0 00000000 01000000 01098000
> [  124.384067] xhci_hcd 0000:01:00.0: @000000002f2d6420 2f32f8d0 00000000 01000000 01098000
> [  124.384072] xhci_hcd 0000:01:00.0: @000000002f2d6430 2f32e6a0 00000000 01000000 01068000
> [  124.384077] xhci_hcd 0000:01:00.0: @000000002f2d6440 2f32f990 00000000 01000000 01098000
> [  124.384082] xhci_hcd 0000:01:00.0: @000000002f2d6450 2f32f9a0 00000000 01000000 01098000
> [  124.384087] xhci_hcd 0000:01:00.0: @000000002f2d6460 2f32e6b0 00000000 01000000 01068000
> [  124.384093] xhci_hcd 0000:01:00.0: @000000002f2d6470 2f32fb60 00000000 01000000 01098000 <--- bogus DMA pointer?
> [  124.384098] xhci_hcd 0000:01:00.0: @000000002f2d6480 2f32e550 00000000 01000000 01068001
> [  124.384103] xhci_hcd 0000:01:00.0: @000000002f2d6490 2f32f610 00000000 01000000 01098001
> [  124.384108] xhci_hcd 0000:01:00.0: @000000002f2d64a0 2f32f620 00000000 01000000 01098001
> [  124.384113] xhci_hcd 0000:01:00.0: @000000002f2d64b0 2f32e560 00000000 01000000 01068001
> [  124.384118] xhci_hcd 0000:01:00.0: @000000002f2d64c0 2f32f630 00000000 01000000 01098001
> [  124.384124] xhci_hcd 0000:01:00.0: @000000002f2d64d0 2f32f640 00000000 01000000 01098001
> [  124.384129] xhci_hcd 0000:01:00.0: @000000002f2d64e0 2f32e570 00000000 01000000 01068001
> [  124.384134] xhci_hcd 0000:01:00.0: @000000002f2d64f0 2f32f650 00000000 01000000 01098001
> [  124.384139] xhci_hcd 0000:01:00.0: @000000002f2d6500 2f32f660 00000000 01000000 01098001
> [  124.384144] xhci_hcd 0000:01:00.0: @000000002f2d6510 2f32e580 00000000 01000000 01068001
> [  124.384149] xhci_hcd 0000:01:00.0: @000000002f2d6520 2f32f670 00000000 01000000 01098001
> [  124.384154] xhci_hcd 0000:01:00.0: @000000002f2d6530 2f32f680 00000000 01000000 01098001
> [  124.384160] xhci_hcd 0000:01:00.0: @000000002f2d6540 2f32e590 00000000 01000000 01068001
> [  124.384165] xhci_hcd 0000:01:00.0: @000000002f2d6550 2f32f690 00000000 01000000 01098001
> [  124.384170] xhci_hcd 0000:01:00.0: @000000002f2d6560 2f32f6a0 00000000 01000000 01098001
> [  124.384175] xhci_hcd 0000:01:00.0: @000000002f2d6570 2f32e5a0 00000000 01000000 01068001
> [  124.384180] xhci_hcd 0000:01:00.0: @000000002f2d6580 2f32f6b0 00000000 01000000 01098001
> [  124.384185] xhci_hcd 0000:01:00.0: @000000002f2d6590 2f32f6c0 00000000 01000000 01098001
> [  124.384190] xhci_hcd 0000:01:00.0: @000000002f2d65a0 2f32e5b0 00000000 01000000 01068001
> [  124.384201] xhci_hcd 0000:01:00.0: @000000002f2d65b0 2f32f6d0 00000000 01000000 01098001
> [  124.384203] xhci_hcd 0000:01:00.0: @000000002f2d65c0 2f32f6e0 00000000 01000000 01098001
> [  124.384205] xhci_hcd 0000:01:00.0: @000000002f2d65d0 2f32e5c0 00000000 01000000 01068001
> [  124.384207] xhci_hcd 0000:01:00.0: @000000002f2d65e0 2f32f6f0 00000000 01000000 01098001
> [  124.384210] xhci_hcd 0000:01:00.0: @000000002f2d65f0 2f32f700 00000000 01000000 01098001
> [  124.384212] xhci_hcd 0000:01:00.0: @000000002f2d6600 2f32e5d0 00000000 01000000 01068001
> [  124.384214] xhci_hcd 0000:01:00.0: @000000002f2d6610 2f32f710 00000000 01000000 01098001
> [  124.384216] xhci_hcd 0000:01:00.0: @000000002f2d6620 2f32f720 00000000 01000000 01098001
> [  124.384218] xhci_hcd 0000:01:00.0: @000000002f2d6630 2f32e5e0 00000000 01000000 01068001
> [  124.384221] xhci_hcd 0000:01:00.0: @000000002f2d6640 2f32f730 00000000 01000000 01098001
> [  124.384223] xhci_hcd 0000:01:00.0: @000000002f2d6650 2f32f740 00000000 01000000 01098001
> [  124.384225] xhci_hcd 0000:01:00.0: @000000002f2d6660 2f32e5f0 00000000 01000000 01068001
> [  124.384227] xhci_hcd 0000:01:00.0: @000000002f2d6670 2f32f750 00000000 01000000 01098001
> [  124.384229] xhci_hcd 0000:01:00.0: @000000002f2d6680 2f32f760 00000000 01000000 01098001
> [  124.384232] xhci_hcd 0000:01:00.0: @000000002f2d6690 2f32e600 00000000 01000000 01068001
> [  124.384234] xhci_hcd 0000:01:00.0: @000000002f2d66a0 2f32f770 00000000 01000000 01098001
> [  124.384236] xhci_hcd 0000:01:00.0: @000000002f2d66b0 2f32f780 00000000 01000000 01098001
> [  124.384238] xhci_hcd 0000:01:00.0: @000000002f2d66c0 2f32e610 00000000 01000000 01068001
> [  124.384241] xhci_hcd 0000:01:00.0: @000000002f2d66d0 2f32f790 00000000 01000000 01098001
> [  124.384243] xhci_hcd 0000:01:00.0: @000000002f2d66e0 2f32f7a0 00000000 01000000 01098001
> [  124.384245] xhci_hcd 0000:01:00.0: @000000002f2d66f0 2f32e620 00000000 01000000 01068001
> [  124.384247] xhci_hcd 0000:01:00.0: @000000002f2d6700 2f32f7b0 00000000 01000000 01098001
> [  124.384249] xhci_hcd 0000:01:00.0: @000000002f2d6710 2f32e630 00000000 01000000 01068001
> [  124.384252] xhci_hcd 0000:01:00.0: @000000002f2d6720 2f32f7c0 00000000 01000000 01098001
> [  124.384254] xhci_hcd 0000:01:00.0: @000000002f2d6730 2f32e640 00000000 01000000 01068001
> [  124.384256] xhci_hcd 0000:01:00.0: @000000002f2d6740 2f32f7d0 00000000 01000000 01098001
> [  124.384258] xhci_hcd 0000:01:00.0: @000000002f2d6750 2f32f7e0 00000000 01000000 01098001
> [  124.384260] xhci_hcd 0000:01:00.0: @000000002f2d6760 2f32e650 00000000 01000000 01068001
> [  124.384263] xhci_hcd 0000:01:00.0: @000000002f2d6770 2f32f800 00000000 01000000 01098001
> [  124.384265] xhci_hcd 0000:01:00.0: @000000002f2d6780 2f32e660 00000000 01000000 01068001
> [  124.384267] xhci_hcd 0000:01:00.0: @000000002f2d6790 2f32f840 00000000 01000000 01098001
> [  124.384269] xhci_hcd 0000:01:00.0: @000000002f2d67a0 2f32f850 00000000 01000000 01098001
> [  124.384271] xhci_hcd 0000:01:00.0: @000000002f2d67b0 2f32e670 00000000 01000000 01068001
> [  124.384273] xhci_hcd 0000:01:00.0: @000000002f2d67c0 2f32f860 00000000 01000000 01098001
> [  124.384276] xhci_hcd 0000:01:00.0: @000000002f2d67d0 2f32e680 00000000 01000000 01068001
> [  124.384278] xhci_hcd 0000:01:00.0: @000000002f2d67e0 2f32f870 00000000 01000000 01098001
> [  124.384280] xhci_hcd 0000:01:00.0: @000000002f2d67f0 2f32f880 00000000 01000000 01098001
> [  124.384283] xhci_hcd 0000:01:00.0: Ring deq = ef2d6480 (virt), 0x2f2d6480 (dma)
> [  124.384284] xhci_hcd 0000:01:00.0: Ring deq updated 712 times
> [  124.384286] xhci_hcd 0000:01:00.0: Ring enq = ef2d6400 (virt), 0x2f2d6400 (dma)
> [  124.384288] xhci_hcd 0000:01:00.0: Ring enq updated 0 times
> [  124.384293] xhci_hcd 0000:01:00.0: ERST deq = 64'h2f2d6480

This is the endpoint ring that had issues:

> [  124.384725] xhci_hcd 0000:01:00.0: Dev 1 endpoint ring 8:
> [  124.384728] xhci_hcd 0000:01:00.0: @000000002f32f800 2f351000 00000000 0000000d 00000424
> [  124.384730] xhci_hcd 0000:01:00.0: @000000002f32f810 2f3ae000 00000000 00201000 00000414
> [  124.384732] xhci_hcd 0000:01:00.0: @000000002f32f820 2f3ac000 00000000 00181000 00000414
> [  124.384734] xhci_hcd 0000:01:00.0: @000000002f32f830 2f3ab000 00000000 00101000 00000414
> [  124.384736] xhci_hcd 0000:01:00.0: @000000002f32f840 2f3aa000 00000000 00081000 00000424
> [  124.384738] xhci_hcd 0000:01:00.0: @000000002f32f850 2f351000 00000000 0000000d 00000424
> [  124.384741] xhci_hcd 0000:01:00.0: @000000002f32f860 2f351000 00000000 0000000d 00000424
> [  124.384743] xhci_hcd 0000:01:00.0: @000000002f32f870 2f3f5000 00000000 00081000 00000424
> [  124.384745] xhci_hcd 0000:01:00.0: @000000002f32f880 2f351000 00000000 0000000d 00000424
> [  124.384747] xhci_hcd 0000:01:00.0: @000000002f32f890 2f3f4000 00000000 00201000 00000414
> [  124.384749] xhci_hcd 0000:01:00.0: @000000002f32f8a0 2f3ef000 00000000 00181000 00000414
> [  124.384752] xhci_hcd 0000:01:00.0: @000000002f32f8b0 2f3e5000 00000000 00101000 00000414
> [  124.384754] xhci_hcd 0000:01:00.0: @000000002f32f8c0 2f3e8000 00000000 00081000 00000424
> [  124.384756] xhci_hcd 0000:01:00.0: @000000002f32f8d0 2f351000 00000000 0000000d 00000424
> [  124.384758] xhci_hcd 0000:01:00.0: @000000002f32f8e0 2f3e3000 00000000 003e1000 00000414
> [  124.384760] xhci_hcd 0000:01:00.0: @000000002f32f8f0 2f3e9000 00000000 003e1000 00000414
> [  124.384762] xhci_hcd 0000:01:00.0: @000000002f32f900 2f3e2000 00000000 003e1000 00000414
> [  124.384765] xhci_hcd 0000:01:00.0: @000000002f32f910 2f3e4000 00000000 003e1000 00000414
> [  124.384767] xhci_hcd 0000:01:00.0: @000000002f32f920 2f3e1000 00000000 003e1000 00000414
> [  124.384769] xhci_hcd 0000:01:00.0: @000000002f32f930 2d007000 00000000 00381000 00000414
> [  124.384771] xhci_hcd 0000:01:00.0: @000000002f32f940 2f3e0000 00000000 00301000 00000414
> [  124.384773] xhci_hcd 0000:01:00.0: @000000002f32f950 2d004000 00000000 00281000 00000414
> [  124.384775] xhci_hcd 0000:01:00.0: @000000002f32f960 2d003000 00000000 00201000 00000414
> [  124.384778] xhci_hcd 0000:01:00.0: @000000002f32f970 2d002000 00000000 00181000 00000414
> [  124.384780] xhci_hcd 0000:01:00.0: @000000002f32f980 2d001000 00000000 00101000 00000414
> [  124.384782] xhci_hcd 0000:01:00.0: @000000002f32f990 2f21a000 00000000 00081000 00000424
> [  124.384784] xhci_hcd 0000:01:00.0: @000000002f32f9a0 2f351000 00000000 0000000d 00000424
> [  124.384786] xhci_hcd 0000:01:00.0: @000000002f32f9b0 2d008000 00000000 003e1000 00000414 <-- TD started here
> [  124.384789] xhci_hcd 0000:01:00.0: @000000002f32f9c0 2f219000 00000000 003e1000 00000414
> [  124.384791] xhci_hcd 0000:01:00.0: @000000002f32f9d0 2f2fe000 00000000 003e1000 00000414
> [  124.384793] xhci_hcd 0000:01:00.0: @000000002f32f9e0 2f3c4000 00000000 003e1000 00000414
> [  124.384795] xhci_hcd 0000:01:00.0: @000000002f32f9f0 2f2ba000 00000000 003e1000 00000414
> [  124.384797] xhci_hcd 0000:01:00.0: @000000002f32fa00 2f366000 00000000 003e1000 00000414
> [  124.384799] xhci_hcd 0000:01:00.0: @000000002f32fa10 2f3d7000 00000000 003e1000 00000414
> [  124.384802] xhci_hcd 0000:01:00.0: @000000002f32fa20 2f540000 00000000 003e2000 00000414
> [  124.384804] xhci_hcd 0000:01:00.0: @000000002f32fa30 2f7ee000 00000000 003e1000 00000414
> [  124.384806] xhci_hcd 0000:01:00.0: @000000002f32fa40 2f3e7000 00000000 003e1000 00000414
> [  124.384808] xhci_hcd 0000:01:00.0: @000000002f32fa50 2f3d4000 00000000 003e1000 00000414
> [  124.384810] xhci_hcd 0000:01:00.0: @000000002f32fa60 2d006000 00000000 003e1000 00000414
> [  124.384812] xhci_hcd 0000:01:00.0: @000000002f32fa70 2f3e6000 00000000 003e1000 00000414
> [  124.384815] xhci_hcd 0000:01:00.0: @000000002f32fa80 2f3de000 00000000 003e1000 00000414
> [  124.384817] xhci_hcd 0000:01:00.0: @000000002f32fa90 2f3f9000 00000000 003e1000 00000414
> [  124.384819] xhci_hcd 0000:01:00.0: @000000002f32faa0 2f3d2000 00000000 003e1000 00000414
> [  124.384821] xhci_hcd 0000:01:00.0: @000000002f32fab0 2f3df000 00000000 003e1000 00000414
> [  124.384823] xhci_hcd 0000:01:00.0: @000000002f32fac0 2f2dd000 00000000 003e1000 00000414
> [  124.384826] xhci_hcd 0000:01:00.0: @000000002f32fad0 2f318000 00000000 003e1000 00000414
> [  124.384828] xhci_hcd 0000:01:00.0: @000000002f32fae0 2f73c000 00000000 003e1000 00000414
> [  124.384830] xhci_hcd 0000:01:00.0: @000000002f32faf0 2f34e000 00000000 003e2000 00000414
> [  124.384832] xhci_hcd 0000:01:00.0: @000000002f32fb00 2f348000 00000000 00381000 00000414
> [  124.384834] xhci_hcd 0000:01:00.0: @000000002f32fb10 2f3c2000 00000000 00302000 00000414
> [  124.384836] xhci_hcd 0000:01:00.0: @000000002f32fb20 2f027000 00000000 00201000 00000414
> [  124.384839] xhci_hcd 0000:01:00.0: @000000002f32fb30 2f283000 00000000 00181000 00000414
> [  124.384841] xhci_hcd 0000:01:00.0: @000000002f32fb40 2f28a000 00000000 00101000 00000414
> [  124.384843] xhci_hcd 0000:01:00.0: @000000002f32fb50 2f241000 00000000 00081000 00000424 <-- TD ends here.
> [  124.384845] xhci_hcd 0000:01:00.0: @000000002f32fb60 2f3d0000 00000000 00081000 00000425
> [  124.384847] xhci_hcd 0000:01:00.0: @000000002f32fb70 2f351000 00000000 0000000d 00000425
> [  124.384850] xhci_hcd 0000:01:00.0: @000000002f32fb80 2f3d1000 00000000 00081000 00000425
> [  124.384852] xhci_hcd 0000:01:00.0: @000000002f32fb90 2f351000 00000000 0000000d 00000425
> [  124.384854] xhci_hcd 0000:01:00.0: @000000002f32fba0 30a4e000 00000000 00081000 00000425
> [  124.384856] xhci_hcd 0000:01:00.0: @000000002f32fbb0 2f351000 00000000 0000000d 00000425
> [  124.384858] xhci_hcd 0000:01:00.0: @000000002f32fbc0 2f364000 00000000 00081000 00000425
> [  124.384860] xhci_hcd 0000:01:00.0: @000000002f32fbd0 2f351000 00000000 0000000d 00000425
> [  124.384863] xhci_hcd 0000:01:00.0: @000000002f32fbe0 2f218000 00000000 00081000 00000425
> [  124.384865] xhci_hcd 0000:01:00.0: @000000002f32fbf0 2f32f400 00000000 00000000 00001803

The transfer event that we had issues with was:

> [  124.384093] xhci_hcd 0000:01:00.0: @000000002f2d6470 2f32fb60 00000000 01000000 01098000

That pointer (2f32fb60) is for the TRB *after* the last TRB of the
transfer.  The xHCI driver hadn't queued anything after that transfer,
so the hardware shouldn't have even owned that TRB.

Completion code is 1 (success), TRB type is 0x20 (32), which is a
transfer event, endpoint ID is 9 (endpoint 0x84, which the driver tracks
as ep index 8), and slot ID is 1.

So the host gave us a successful completion event with a bad TRB DMA
pointer.

And then the URB got canceled:

> [  127.840043] xhci_hcd 0000:01:00.0: Cancel URB f0a5d080, dev 1, ep 0x84, starting at offset 0x2f32f9b0
> [  127.840056] xhci_hcd 0000:01:00.0: // Ding dong!
> [  132.848014] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
> [  132.848021] xhci_hcd 0000:01:00.0: Assuming host is dying, halting host.
> [  132.848029] xhci_hcd 0000:01:00.0: // Halt the HC

The host must have been pretty hosed, because it didn't respond to the
request to halt the host controller:

> [  132.885562] xhci_hcd 0000:01:00.0: Host not halted after 16000 microseconds.
> [  132.885564] xhci_hcd 0000:01:00.0: Non-responsive xHCI host is not halting.
> [  132.885565] xhci_hcd 0000:01:00.0: Completing active URBs anyway.
> [  132.885568] xhci_hcd 0000:01:00.0: Killing URBs for slot ID 1, ep index 0
> [  132.885570] xhci_hcd 0000:01:00.0: Killing URBs for slot ID 1, ep index 5
> [  132.885572] xhci_hcd 0000:01:00.0: Killing URBs for slot ID 1, ep index 8
> [  132.885579] xhci_hcd 0000:01:00.0: Calling usb_hc_died()
> [  132.885582] xhci_hcd 0000:01:00.0: HC died; cleaning up
> [  132.885590] xhci_hcd 0000:01:00.0: xHCI host controller is dead.
> [  132.885598] xhci_hcd 0000:01:00.0: set port reset, actual port 0 status  = 0x1311
> [  132.885608] usb 3-1: USB disconnect, device number 2
> [  132.940019] hub 3-0:1.0: hub_port_status failed (err = -19)

I would say this is a host controller hardware bug.  I'm not sure what
triggered it; maybe the number of TRBs in the TD?  You can try limiting
the number of TRBs that are submitted by the SCSI driver by changing the
value of sg_tablesize in drivers/usb/host/xhci.c:

int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
{
        struct xhci_hcd         *xhci;
        struct device           *dev = hcd->self.controller;
        int                     retval;
        u32                     temp;

        /* Accept arbitrarily long scatter-gather lists */
        hcd->self.sg_tablesize = ~0;

The host got out of sync after a 27-TRB TD, so you could change the
sg_tablesize to something smaller, like 25 or 20.  If you have trouble
modifying the kernel yourself, I can make a patch for you.

Are there any host controller firmware updates?  If so, I would suggest
you try updating the host's firmware and see if the issue goes away.

Is this an add-in card that you can replace with a different vendor or
newer chipset revision?  If all else fails, getting a new host
controller may be the only solution.

Can you send me the output of `sudo lspci -vvv` and
`sudo lspci -vvv -n`?  I assume this is a PCI host controller.

Sarah Sharp

