>> On Wed, Nov 6, 2013 at 12:33 PM, <hemantk@xxxxxxxxxxxxxx> wrote: >>>> On Wed, Nov 6, 2013 at 12:53 AM, <hemantk@xxxxxxxxxxxxxx> wrote: >>>>> Hi >>>>> >>>>> By performing iterative port suspend and resume (which results in >>>>> function >>>>> suspend and resume), ring expansion failure is observed. Attached >>>>> device >>>>> has multiple interfaces for which interface host drivers are >>>>> unlinking >>>>> the >>>>> urbs during function suspend and submitting urbs during resume. >>>>> >>>>> For the cases when dequeue ptr is close to the end of deq segment, >>>>> value >>>>> of num_trbs_in_deq_seg (= ring->dequeue - ring->deq_seg->trbs) >>>>> becomes >>>>> high when function suspend is over. At the time of resume this high >>>>> value >>>>> is causing to trigger ring expansion in the middle of submitting >>>>> urbs(ring->num_trbs_free < num_trbs + num_trbs_in_deq_seg). All the >>>>> interface host drivers only request one TRB per urb (num_trb =1). >>>>> Ring >>>>> expansion logic is doubling the num of previously allocated segments >>>>> every >>>>> time. This is causing the num of segments to grow at fast pace (at >>>>> the >>>>> time of failure, one of the ep rings num_seg was 128), even though it >>>>> is >>>>> also increasing num_trbs_free. >>>>> >>>> >>>> Sounds like a bug. Why num_trbs_in_deq_seg becomes high? How many urbs >>>> are submitted during resume? Normally only isoc transfer triggers ring >>>> expansion with multiple trbs per urb. >>> >>> Let me explain it little further what is being observed in terms of enq >>> , >>> deq ptr:- >>> >>> For example >>> 1) Let?s say First seg of the ring starts with 0xed0d3000 which >>> means enq >>> ptr and deq ptr are pointing to this address. >>> 2) Interface driver queuing 10 urbs (one urb represents one trb) >>> enq ptr >>> goes to ed0d30a0. >>> 3) Interface suspend happened. All 10 urbs are going to get >>> unlinked. For >>> the first urb to be unlinked, it is added to ep->cancelled_td_list and >>> Stop ep cmd was called (rest of the urbs added to ep->cancelled_td_list >>> before stop ep cmd completed). >>> 4) When stop ep cmd is completed, for 9 urbs td_to_noop() is >>> called. For >>> the first urb (for which stop ep cmd was called), >>> xhci_find_new_dequeue_state() gets called which updates deq ptr to next >>> trb (0x ed0d3010) so set tr deq ptr cmd is issued to xhci to set its >>> deq >>> ptr to next trb(0xed0d3010). >>> 5) In set tr deq cmd completion handler >>> update_ring_for_set_deq_completion() gets called which will just >>> increments num_trbs_free to only one, but we unlinked all the urbs. So >>> when interface suspend finished we have enq ptr pointing to 0x ed0d30a0 >>> and dep ptr pointing to 0x ed0d3010. With num_trbs_free only >>> incremented >>> by one. >>> 6) At the time of interface resume we re-queue 10 urbs back with >>> num_trbs_free (just incremented 1 during suspend). This means >>> num_trbs_free is incrementing very slowly every time interface suspend >>> happens. And it is very easy to get into a situation where >>> room_on_ring() >>> returning 0 as deq ptr will be reaching to the end of the seg and >>> causing >>> num_trbs_in_deq_seg to go high enough that ring->num_trbs_free < >>> num_trbs >>> + num_trbs_in_deq_seg condition becomes true. >>> >> >> So it's because the suspend/resume loops and urbs are enqueued and >> unlinked, deq_ptr does not follow enq_ptr. This is expected since >> deq_ptr and num_trbs_free increase in inc_deq(), which is called by >> irq handler. As there is no transfer on this ring, deq_ptr will not >> get updated. > Considering the suspend resume scenario, with this behavior don't you > think ring expansion will be triggered very often and for no reason (as we > are just doing suspend and resume). > > In handle_stopped_endpoint() when traversing to ep->cancelled_td_list is > it better to call xhci_find_new_dequeue_state() for last td and call > td_to_noop() for all previous tds. Doing that we will update deq ptr to > set to enq ptr with correct value of num_trbs_free when interface suspends > finishes. i think instead of what suggested above, we need to somehow update num_trbs_free to the correct value any time before interface driver is going to submit the first urb as a result of interface resume. As after resume xhci is going to skip no op trbs and starts fetching the transfer trbs on the ring. >> >> I'm still curious about the failure of ring expansion. 128 segments >> occupy 128KB, which should not be a big deal. Ring expansion is >> expected to run until there is no memory to allocate. Does it fail >> because of allocation failure? > yes, xhci_ring_expansion() returns -ENOMEM. There are 19 rings allocated. >> >> Thanks, >> Andiry >> >>> >>>>> Even for the cases where we observe data transfer as well as suspend >>>>> & >>>>> resume (which are normal scenarios) ring expansion failure is easily >>>>> observed due to large number of segments being allocated for ever. >>>>> This >>>>> means triggering of ring expansion will result in failure of further >>>>> expansion at some point of time. >>>>> >>>>> I was trying to think of some options to reduce ring expansion with >>>>> respect to suspend resume scenario: - >>>>> 1) Can we reset enq and deq ptr to point to ep_ring first seg >>>>> once >>>>> all the >>>>> urbs are unlinked (function suspend)? >>>>> 2) Can we free the segments at the time of suspend and allocate >>>>> segments >>>>> back when resume happens? >>>>> >>>> >>>> These does not point to the root cause of ring expansion failure. >>>> >>>> Thanks, >>>> Andiry >>>> >>>>> In general, as per the spec can we also implement shrinking of >>>>> transfer >>>>> ring (4.9.2.4). >>>>> >>>>> Please help to provide your comments on the options above (which can >>>>> easily fit in to the current design) also feel free to add better >>>>> options >>>>> to consider to avoid ring expansion failures which can be added >>>>> easily >>>>> to >>>>> the current implementation. >>>>> >>>>> Currently I am using 3.10 kernel and back ported recent fixes on >>>>> xhci. >>>>> >>>>> Thanks, >>>>> Hemant >>>>> >>>>> >>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-usb" >>>>> in >>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>> >>> >> > > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html