Re: XHCI: Ring expansion failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>> On Wed, Nov 6, 2013 at 12:33 PM,  <hemantk@xxxxxxxxxxxxxx> wrote:
>>>> On Wed, Nov 6, 2013 at 12:53 AM,  <hemantk@xxxxxxxxxxxxxx> wrote:
>>>>> Hi
>>>>>
>>>>> By performing iterative port suspend and resume (which results in
>>>>> function
>>>>> suspend and resume), ring expansion failure is observed. Attached
>>>>> device
>>>>> has multiple interfaces for which interface host drivers are
>>>>> unlinking
>>>>> the
>>>>> urbs during function suspend and submitting urbs during resume.
>>>>>
>>>>> For the cases when dequeue ptr is close to the end of deq segment,
>>>>> value
>>>>> of num_trbs_in_deq_seg (= ring->dequeue - ring->deq_seg->trbs)
>>>>> becomes
>>>>> high when function suspend is over. At the time of resume this high
>>>>> value
>>>>> is causing to trigger ring expansion in the middle of submitting
>>>>> urbs(ring->num_trbs_free < num_trbs + num_trbs_in_deq_seg).  All the
>>>>> interface host drivers only request one TRB per urb (num_trb =1).
>>>>> Ring
>>>>> expansion logic is doubling the num of previously allocated segments
>>>>> every
>>>>> time. This is causing the num of segments to grow at fast pace (at
>>>>> the
>>>>> time of failure, one of the ep rings num_seg was 128), even though it
>>>>> is
>>>>> also increasing num_trbs_free.
>>>>>
>>>>
>>>> Sounds like a bug. Why num_trbs_in_deq_seg becomes high? How many urbs
>>>> are submitted during resume? Normally only isoc transfer triggers ring
>>>> expansion with multiple trbs per urb.
>>>
>>> Let me explain it little further what is being observed in terms of enq
>>> ,
>>> deq ptr:-
>>>
>>> For example
>>> 1)      Let?s say First seg of the ring starts with 0xed0d3000 which
>>> means enq
>>> ptr and deq ptr are pointing to this address.
>>> 2)      Interface driver queuing 10 urbs (one urb represents one trb)
>>> enq ptr
>>> goes to ed0d30a0.
>>> 3)      Interface suspend happened. All 10 urbs are going to get
>>> unlinked. For
>>> the first urb to be unlinked, it is added to ep->cancelled_td_list  and
>>> Stop ep cmd was called (rest of the urbs added to ep->cancelled_td_list
>>> before stop ep cmd completed).
>>> 4)      When stop ep cmd is completed, for 9 urbs td_to_noop() is
>>> called.  For
>>> the first urb (for which stop ep cmd was called),
>>> xhci_find_new_dequeue_state() gets called which updates deq ptr to next
>>> trb (0x ed0d3010) so set tr deq ptr cmd is issued to xhci to set its
>>> deq
>>> ptr to next trb(0xed0d3010).
>>> 5)      In set tr deq cmd completion handler
>>> update_ring_for_set_deq_completion() gets called which will just
>>> increments num_trbs_free to only one, but we unlinked all the urbs. So
>>> when interface suspend finished we have enq ptr pointing to 0x ed0d30a0
>>> and dep ptr pointing to 0x ed0d3010. With num_trbs_free only
>>> incremented
>>> by one.
>>> 6)      At the time of interface resume we re-queue 10 urbs back with
>>> num_trbs_free (just incremented 1 during suspend). This means
>>> num_trbs_free is incrementing very slowly every time interface suspend
>>> happens. And it is very easy to get into a situation where
>>> room_on_ring()
>>> returning 0 as deq ptr will be reaching to the end of the seg and
>>> causing
>>> num_trbs_in_deq_seg to go high enough that ring->num_trbs_free <
>>> num_trbs
>>> + num_trbs_in_deq_seg condition becomes true.
>>>
>>
>> So it's because the suspend/resume loops and urbs are enqueued and
>> unlinked, deq_ptr does not follow enq_ptr. This is expected since
>> deq_ptr and num_trbs_free increase in inc_deq(), which is called by
>> irq handler. As there is no transfer on this ring, deq_ptr will not
>> get updated.
> Considering the suspend resume scenario, with this behavior don't you
> think ring expansion will be triggered very often and for no reason (as we
> are just doing suspend and resume).
>
> In handle_stopped_endpoint() when traversing to ep->cancelled_td_list is
> it better to call xhci_find_new_dequeue_state() for last td and call
> td_to_noop() for all previous tds. Doing that we will update deq ptr to
> set to enq ptr with correct value of num_trbs_free when interface suspends
> finishes.
i think instead of what suggested above, we need to somehow update
num_trbs_free to the correct value any time before interface driver is
going to submit the first urb as a result of interface resume. As after
resume xhci is going to skip no op trbs and starts fetching the transfer
trbs on the ring.

>>
>> I'm still curious about the failure of ring expansion. 128 segments
>> occupy 128KB, which should not be a big deal. Ring expansion is
>> expected to run until there is no memory to allocate. Does it fail
>> because of allocation failure?
> yes, xhci_ring_expansion() returns -ENOMEM. There are 19 rings allocated.
>>
>> Thanks,
>> Andiry
>>
>>>
>>>>> Even for the cases where we observe data transfer as well as suspend
>>>>> &
>>>>> resume (which are normal scenarios) ring expansion failure is easily
>>>>> observed due to large number of segments being allocated for ever.
>>>>> This
>>>>> means triggering of ring expansion will result in failure of further
>>>>> expansion at some point of time.
>>>>>
>>>>> I was trying to think of some options to reduce ring expansion with
>>>>> respect to suspend resume scenario: -
>>>>> 1)      Can we reset enq and deq ptr to point to ep_ring first seg
>>>>> once
>>>>> all the
>>>>> urbs are unlinked (function suspend)?
>>>>> 2)      Can we free the segments at the time of suspend and allocate
>>>>> segments
>>>>> back when resume happens?
>>>>>
>>>>
>>>> These does not point to the root cause of ring expansion failure.
>>>>
>>>> Thanks,
>>>> Andiry
>>>>
>>>>> In general, as per the spec can we also implement shrinking of
>>>>> transfer
>>>>> ring (4.9.2.4).
>>>>>
>>>>> Please help to provide your comments on the options above (which can
>>>>> easily fit in to the current design) also feel free to add better
>>>>> options
>>>>> to consider to avoid ring expansion failures which can be added
>>>>> easily
>>>>> to
>>>>> the current implementation.
>>>>>
>>>>> Currently I am using 3.10 kernel and back ported recent fixes on
>>>>> xhci.
>>>>>
>>>>> Thanks,
>>>>> Hemant
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-usb"
>>>>> in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>>
>>
>
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux