Re: XHCI: Ring expansion failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Wed, Nov 6, 2013 at 12:33 PM,  <hemantk@xxxxxxxxxxxxxx> wrote:
>>> On Wed, Nov 6, 2013 at 12:53 AM,  <hemantk@xxxxxxxxxxxxxx> wrote:
>>>> Hi
>>>>
>>>> By performing iterative port suspend and resume (which results in
>>>> function
>>>> suspend and resume), ring expansion failure is observed. Attached
>>>> device
>>>> has multiple interfaces for which interface host drivers are unlinking
>>>> the
>>>> urbs during function suspend and submitting urbs during resume.
>>>>
>>>> For the cases when dequeue ptr is close to the end of deq segment,
>>>> value
>>>> of num_trbs_in_deq_seg (= ring->dequeue - ring->deq_seg->trbs) becomes
>>>> high when function suspend is over. At the time of resume this high
>>>> value
>>>> is causing to trigger ring expansion in the middle of submitting
>>>> urbs(ring->num_trbs_free < num_trbs + num_trbs_in_deq_seg).  All the
>>>> interface host drivers only request one TRB per urb (num_trb =1). Ring
>>>> expansion logic is doubling the num of previously allocated segments
>>>> every
>>>> time. This is causing the num of segments to grow at fast pace (at the
>>>> time of failure, one of the ep rings num_seg was 128), even though it
>>>> is
>>>> also increasing num_trbs_free.
>>>>
>>>
>>> Sounds like a bug. Why num_trbs_in_deq_seg becomes high? How many urbs
>>> are submitted during resume? Normally only isoc transfer triggers ring
>>> expansion with multiple trbs per urb.
>>
>> Let me explain it little further what is being observed in terms of enq
>> ,
>> deq ptr:-
>>
>> For example
>> 1)      Let?s say First seg of the ring starts with 0xed0d3000 which
>> means enq
>> ptr and deq ptr are pointing to this address.
>> 2)      Interface driver queuing 10 urbs (one urb represents one trb)
>> enq ptr
>> goes to ed0d30a0.
>> 3)      Interface suspend happened. All 10 urbs are going to get
>> unlinked. For
>> the first urb to be unlinked, it is added to ep->cancelled_td_list  and
>> Stop ep cmd was called (rest of the urbs added to ep->cancelled_td_list
>> before stop ep cmd completed).
>> 4)      When stop ep cmd is completed, for 9 urbs td_to_noop() is
>> called.  For
>> the first urb (for which stop ep cmd was called),
>> xhci_find_new_dequeue_state() gets called which updates deq ptr to next
>> trb (0x ed0d3010) so set tr deq ptr cmd is issued to xhci to set its deq
>> ptr to next trb(0xed0d3010).
>> 5)      In set tr deq cmd completion handler
>> update_ring_for_set_deq_completion() gets called which will just
>> increments num_trbs_free to only one, but we unlinked all the urbs. So
>> when interface suspend finished we have enq ptr pointing to 0x ed0d30a0
>> and dep ptr pointing to 0x ed0d3010. With num_trbs_free only incremented
>> by one.
>> 6)      At the time of interface resume we re-queue 10 urbs back with
>> num_trbs_free (just incremented 1 during suspend). This means
>> num_trbs_free is incrementing very slowly every time interface suspend
>> happens. And it is very easy to get into a situation where
>> room_on_ring()
>> returning 0 as deq ptr will be reaching to the end of the seg and
>> causing
>> num_trbs_in_deq_seg to go high enough that ring->num_trbs_free <
>> num_trbs
>> + num_trbs_in_deq_seg condition becomes true.
>>
>
> So it's because the suspend/resume loops and urbs are enqueued and
> unlinked, deq_ptr does not follow enq_ptr. This is expected since
> deq_ptr and num_trbs_free increase in inc_deq(), which is called by
> irq handler. As there is no transfer on this ring, deq_ptr will not
> get updated.
Considering the suspend resume scenario, with this behavior don't you
think ring expansion will be triggered very often and for no reason (as we
are just doing suspend and resume).

In handle_stopped_endpoint() when traversing to ep->cancelled_td_list is
it better to call xhci_find_new_dequeue_state() for last td and call
td_to_noop() for all previous tds. Doing that we will update deq ptr to
set to enq ptr with correct value of num_trbs_free when interface suspends
finishes.
>
> I'm still curious about the failure of ring expansion. 128 segments
> occupy 128KB, which should not be a big deal. Ring expansion is
> expected to run until there is no memory to allocate. Does it fail
> because of allocation failure?
yes, xhci_ring_expansion() returns -ENOMEM. There are 19 rings allocated.
>
> Thanks,
> Andiry
>
>>
>>>> Even for the cases where we observe data transfer as well as suspend &
>>>> resume (which are normal scenarios) ring expansion failure is easily
>>>> observed due to large number of segments being allocated for ever.
>>>> This
>>>> means triggering of ring expansion will result in failure of further
>>>> expansion at some point of time.
>>>>
>>>> I was trying to think of some options to reduce ring expansion with
>>>> respect to suspend resume scenario: -
>>>> 1)      Can we reset enq and deq ptr to point to ep_ring first seg
>>>> once
>>>> all the
>>>> urbs are unlinked (function suspend)?
>>>> 2)      Can we free the segments at the time of suspend and allocate
>>>> segments
>>>> back when resume happens?
>>>>
>>>
>>> These does not point to the root cause of ring expansion failure.
>>>
>>> Thanks,
>>> Andiry
>>>
>>>> In general, as per the spec can we also implement shrinking of
>>>> transfer
>>>> ring (4.9.2.4).
>>>>
>>>> Please help to provide your comments on the options above (which can
>>>> easily fit in to the current design) also feel free to add better
>>>> options
>>>> to consider to avoid ring expansion failures which can be added easily
>>>> to
>>>> the current implementation.
>>>>
>>>> Currently I am using 3.10 kernel and back ported recent fixes on xhci.
>>>>
>>>> Thanks,
>>>> Hemant
>>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-usb"
>>>> in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux