Re: [RFC v3 7/8] xHCI: wait for dequeue pointer move pass link TRB

Andiry Xu <andiry.xu@xxxxxxx> · Fri, 9 Dec 2011 17:32:57 +0800

On 11/11/2011 04:36 AM, Sarah Sharp wrote:
> On Thu, Nov 10, 2011 at 05:12:04PM +0800, Andiry Xu wrote:
>> When trying to expand a ring, if enqueue pointer is behind dequeue pointer
>> and they're in the same segment, we can not expand the ring until the
>> dequeue pointer move pass the link TRB. This is because if the hardware
>> has not cached the link TRB, it will jump to the new segments, but the
>> data there is not right and the hardware may stop if the cycle_bit
>> indicates it's software owned.
>>
>> Wait for the dequeue pointer move pass the link TRB, and then it's safe
>> to expand the ring. Pending all the urbs submitted during the waiting period
>> and queue them after the ring successfully expanded. When an endpoint is
>> stopped, clear all the pending urbs.
> 
> I'm not sure why you're giving back pending URBs when the endpoint
> stops.  There's no reason that canceling a URB that has made it onto
> the ring should suddenly cancel all the URBs waiting to get on the ring.
> Only the watchdog timer running should cause all the pending URBs to get
> given back.
> 
> What I wanted you to do was check in urb_dequeue whether the URB to be
> canceled was in the pending queue and just take it out of the pending
> queue.  That you wouldn't have to even issue the stop endpoint command.
> You're going to have to decrement the pending_num_trbs when you take it
> out of the list, of course.
> 
> Also, why do you need both pending_num_trbs and waiting_deq_ptr?  When
> checking to see if the ring needs to be expanded, why not just check if
> the TRB count is non-zero or if the pending_urb list is empty?
> 
> I think there might be a couple other subtle issues, but I need to apply
> your patchset and check.
> 

After re-examine this patch, I found there is another design issue. The
mechanism when the special case (waiting for host walk around the link
TRB) triggered is:

1. Hold the urbs in a list, report to usbcore that they're successfully
submitted.
2. Waiting for the host walk around the link TRB.
3. Expand the ring and queue the pending urbs.

However, we can not guarantee step3 will succeed: xhci_ring_expansion()
may fail with low memory, the endpoint state may change, and queueing of
the urbs may fail. So usbcore gets the report that urb is submitted
successfully, while it's actually failed. I don't think this is good. I
may re-consider the design.

Can I hold the patch temporarily and submit the remaining patchset by
now? It should work without the patch since this is a case very hard to
trigger. As more and more guys encounter the ring full issue, I think
the patchset should be merged. Sorry for the delay.

Thanks,
Andiry

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html