On Wed, Nov 6, 2013 at 12:33 PM, <hemantk@xxxxxxxxxxxxxx> wrote: >> On Wed, Nov 6, 2013 at 12:53 AM, <hemantk@xxxxxxxxxxxxxx> wrote: >>> Hi >>> >>> By performing iterative port suspend and resume (which results in >>> function >>> suspend and resume), ring expansion failure is observed. Attached device >>> has multiple interfaces for which interface host drivers are unlinking >>> the >>> urbs during function suspend and submitting urbs during resume. >>> >>> For the cases when dequeue ptr is close to the end of deq segment, value >>> of num_trbs_in_deq_seg (= ring->dequeue - ring->deq_seg->trbs) becomes >>> high when function suspend is over. At the time of resume this high >>> value >>> is causing to trigger ring expansion in the middle of submitting >>> urbs(ring->num_trbs_free < num_trbs + num_trbs_in_deq_seg). All the >>> interface host drivers only request one TRB per urb (num_trb =1). Ring >>> expansion logic is doubling the num of previously allocated segments >>> every >>> time. This is causing the num of segments to grow at fast pace (at the >>> time of failure, one of the ep rings num_seg was 128), even though it is >>> also increasing num_trbs_free. >>> >> >> Sounds like a bug. Why num_trbs_in_deq_seg becomes high? How many urbs >> are submitted during resume? Normally only isoc transfer triggers ring >> expansion with multiple trbs per urb. > > Let me explain it little further what is being observed in terms of enq , > deq ptr:- > > For example > 1) Let’s say First seg of the ring starts with 0xed0d3000 which means enq > ptr and deq ptr are pointing to this address. > 2) Interface driver queuing 10 urbs (one urb represents one trb) enq ptr > goes to ed0d30a0. > 3) Interface suspend happened. All 10 urbs are going to get unlinked. For > the first urb to be unlinked, it is added to ep->cancelled_td_list and > Stop ep cmd was called (rest of the urbs added to ep->cancelled_td_list > before stop ep cmd completed). > 4) When stop ep cmd is completed, for 9 urbs td_to_noop() is called. For > the first urb (for which stop ep cmd was called), > xhci_find_new_dequeue_state() gets called which updates deq ptr to next > trb (0x ed0d3010) so set tr deq ptr cmd is issued to xhci to set its deq > ptr to next trb(0xed0d3010). > 5) In set tr deq cmd completion handler > update_ring_for_set_deq_completion() gets called which will just > increments num_trbs_free to only one, but we unlinked all the urbs. So > when interface suspend finished we have enq ptr pointing to 0x ed0d30a0 > and dep ptr pointing to 0x ed0d3010. With num_trbs_free only incremented > by one. > 6) At the time of interface resume we re-queue 10 urbs back with > num_trbs_free (just incremented 1 during suspend). This means > num_trbs_free is incrementing very slowly every time interface suspend > happens. And it is very easy to get into a situation where room_on_ring() > returning 0 as deq ptr will be reaching to the end of the seg and causing > num_trbs_in_deq_seg to go high enough that ring->num_trbs_free < num_trbs > + num_trbs_in_deq_seg condition becomes true. > So it's because the suspend/resume loops and urbs are enqueued and unlinked, deq_ptr does not follow enq_ptr. This is expected since deq_ptr and num_trbs_free increase in inc_deq(), which is called by irq handler. As there is no transfer on this ring, deq_ptr will not get updated. I'm still curious about the failure of ring expansion. 128 segments occupy 128KB, which should not be a big deal. Ring expansion is expected to run until there is no memory to allocate. Does it fail because of allocation failure? Thanks, Andiry > >>> Even for the cases where we observe data transfer as well as suspend & >>> resume (which are normal scenarios) ring expansion failure is easily >>> observed due to large number of segments being allocated for ever. This >>> means triggering of ring expansion will result in failure of further >>> expansion at some point of time. >>> >>> I was trying to think of some options to reduce ring expansion with >>> respect to suspend resume scenario: - >>> 1) Can we reset enq and deq ptr to point to ep_ring first seg once >>> all the >>> urbs are unlinked (function suspend)? >>> 2) Can we free the segments at the time of suspend and allocate >>> segments >>> back when resume happens? >>> >> >> These does not point to the root cause of ring expansion failure. >> >> Thanks, >> Andiry >> >>> In general, as per the spec can we also implement shrinking of transfer >>> ring (4.9.2.4). >>> >>> Please help to provide your comments on the options above (which can >>> easily fit in to the current design) also feel free to add better >>> options >>> to consider to avoid ring expansion failures which can be added easily >>> to >>> the current implementation. >>> >>> Currently I am using 3.10 kernel and back ported recent fixes on xhci. >>> >>> Thanks, >>> Hemant >>> >>> >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-usb" in >>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html