Re: [PATCH 0/3 v4] ISOC supporting for USB 3.0

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2010-04-27 at 14:59 -0700, Sarah Sharp wrote:
> What the log shows is that the hardware gave back a transfer completion
> with a DMA pointer of all zeros (the suspect DMA = 0x0 line).  I can't
> tell what the completion code was, only that the endpoint ring that was
> indexed in the event had at least one TD on the ring.
> 
> After the transfer event with the zeroed DMA pointer, the host attempted
> to give back several TDs.  It looks like those events are for the same
> isoc ring, after the TD at the head of the endpoint list.  The ring
> handler doesn't complete the queued TD when it receives an event that
> looks erroneous, so the endpoint gets stuck at one TD.
> 
>From the log, that's possibly what happens.

> 
> Now, I see there's this note in the xHCI spec:
> 
> "Note: The TRB Pointer field of Missed Service Error Transfer Event may
> be ‘0’."
> 
> Section 4.10.3.2 talks about what happens when the host can't transfer
> the data within the requested service interval, and it says that it may
> give a transfer event completion code of "Missed Service Error" with a
> DMA pointer of zero.  I wonder if that's what I'm seeing?
> 
Probably. Ring Overrun and Underrun event also have DMA pointer of zero.
But it should be processed within the patch. 

> The DMA pointers in the transfer completion events do look a bit weird
> even with that theory:
> 
> Apr 27 13:42:45 xanatos kernel: [12080.477127] trb_in_td - suspect DMA = 0x0
> Apr 27 13:42:45 xanatos kernel: [12080.477291] trb_in_td - suspect DMA = 0x3cd9d890
> Apr 27 13:42:45 xanatos kernel: [12080.477291] trb_in_td - suspect DMA = 0x3cd9d890
> Apr 27 13:42:45 xanatos kernel: [12080.477411] trb_in_td - suspect DMA = 0x3cd9d8a0
> Apr 27 13:42:45 xanatos kernel: [12080.477537] trb_in_td - suspect DMA = 0x3cd9d8b0
> 
> Why would it giveback 0x3cd9d890 twice?

I only see 0x3cd9d890 in the full log for once. Please note the time
stamp - both 12080.477291. Maybe you just copy it for twice?

> I'm also wondering about this line:
> 
> Apr 27 13:42:09 xanatos kernel: [12044.124863] xhci_hcd 0000:05:00.0: WARN Event TRB for slot 1 ep 2 with no TDs queued?
> 
This line is strange to be here. A missed service error indicates the
xHC was unable to complete the data transfer associated with an Isoch
TRB within the ESIT. but if an endpoint has no tds queued, what does the
xHC has to process? 

> If this is related to the missed service error, then you need to add
> checking for that completion code before attempting to find the endpoint
> segment that the event came from.  So there would need to be some
> completion code checking before this chunk of code:
> 
> 	/* Is this a TRB in the currently executing TD? */
> 	event_seg = trb_in_td(ep_ring->deq_seg, ep_ring->dequeue,
> 			td->last_trb, event_dma, false);
> 	if (!event_seg) {
> 		/* HC is busted, give up! */
> 		xhci_err(xhci, "ERROR Transfer event TRB DMA ptr not part of current TD\n");
> 
> The xHCI driver should just giveback the first TD in the list when it
> sees that completion code, since the host is going to advance to the
> next TD.  I'm not sure what completion code you should give the URB.
> Maybe -EREMOTEIO for a short transfer and set the transferred_length to
> zero?
> 
In this case, it should be treated as a short transfer and the
actual_length should be set to 0. But only to the corresponding packet,
not the whole URB.

I have been trying different USB isoc devices these days, and finally I
seem to be able to reproduce the issue with Hauppauge WinTV-HVR950,
which is a USB2.0 TV tuner. I used tvtime to play TV. It works fine, but
sometimes when I quit the application, it shows "ERROR Transfer event
TRB DMA ptr not part of current TD" message. Sorry the log file is too
big and I only show the interesting part.

Here is the suspicious part I catch:   

Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388580] xhci_hcd 0000:02:00.0: Miss service interval error
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388582] xhci_hcd 0000:02:00.0: event = ffff88002177b610
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388583] xhci_hcd 0000:02:00.0: td->last_trb = ffff880021cb2420
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388586] xhci_hcd 0000:02:00.0: Event ring:
......
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388668] xhci_hcd 0000:02:00.0: @2177b5f0 21cb2230 00000000 0d0003c4 01058000
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388670] xhci_hcd 0000:02:00.0: @2177b600 2177b1c0 00000000 01000000 01008400
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388673] xhci_hcd 0000:02:00.0: @2177b610 00000000 00000000 17000000 01058000
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388676] xhci_hcd 0000:02:00.0: @2177b620 21cb2440 00000000 01000000 01058000
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388679] xhci_hcd 0000:02:00.0: @2177b630 21cb2450 00000000 01000000 01058000

Apr 29 01:44:23 brienza-laptop kernel: [ 1739.388753] xhci_hcd 0000:02:00.0: Endpoint ring:
......
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389345] xhci_hcd 0000:02:00.0: @21cb2230 1a707c44 00000000 00040b4c 80001425
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389348] xhci_hcd 0000:02:00.0: @21cb2240 00000000 00000000 00000000 00002001
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389351] xhci_hcd 0000:02:00.0: @21cb2250 00000000 00000000 00000000 00002001
......
(There are all no-op trbs here. Seems cancelled tds)
......
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389419] xhci_hcd 0000:02:00.0: @21cb2400 00000000 00000000 00000000 00002001
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389422] xhci_hcd 0000:02:00.0: @21cb2410 00000000 00000000 00000000 00002001
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389425] xhci_hcd 0000:02:00.0: @21cb2420 1a720000 00000000 00040b4c 80001425
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389428] xhci_hcd 0000:02:00.0: @21cb2430 1a720b4c 00000000 00040b4c 80001425
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389430] xhci_hcd 0000:02:00.0: @21cb2440 1a721698 00000000 00040b4c 80001425
Apr 29 01:44:23 brienza-laptop kernel: [ 1739.389433] xhci_hcd 0000:02:00.0: @21cb2450 1a7221e4 00000000 00040b4c 80001425

Here is what happen: td @21cb2230 is processed OK and there is a
corresponding event @2177b5f0. Ignore event @2177b6f0, it's for control
endpoint. Now there're no-op TRBs on the endpoint ring, probably
cancelled TDs. event @2177b610 is a missed service error event. maybe
it's due to the no-op TRBs on the ring? 
When the xHC continues to process, you can see two valid isoc tds
@21cb2420 and 21cb2430 have no corresponding events in the event ring.
The next event is for td @21cb2440. The td @21cb2420 is stucked and many
"ERROR Transfer event TRB DMA ptr not part of current TD" messages burst
out.

Now that's the trick part. When submit one missed service error event,
the xHC actually missed two isoc tds. (According to the spec, in this
case there should be two missed service error events here - correct me
if I'm wrong.) Even if advance to next td when encounter missed service
error event, in this case it will go to td @21cb2430 - unfortunately
still stucked. I'll do more tests to see if this is always what happens
and what should do about this. You can check your log and see if this is
what happening at your side.

Andiry Xu


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux