Re: [RFC 0/9] xHCI 1.0 patchset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2011-05-11 at 01:11 +0800, Sarah Sharp wrote:
> On Tue, May 10, 2011 at 11:36:48AM +0800, He, Alex wrote:
> > Hi Sarah,
> >
> > Thanks for your correction and suggestion.
> >
> > I am sorry for that I don't give a very clear description for FSE
> patch.
> > Let me to explain it again with a special scenario.
> >       1. There are two TDs in td_list that will be transferred.
> 
> >       2. The first TD was transferred successfully reported by an
> >       transfer event with COMP_SUCCESS. The xHCI driver called
> >       finish_td() to remove the first TD from the td_list and
> increase
> >       the ep_ring->dequeue and ep_ring->deq_seg to the next TD. But
> >       the HW internal ep ring dequeue maintained by xHC may stay at
> >       the last TRB of the first TD before start the next transfer
> for
> >       the second TD.
> 
> Ok, at first I wondered how this could ever happen, since it kind of
> sounds like a hardware bug, but it's documented in Table 3 of the xHCI
> spec, so I guess it's legitimate behavior.

I am also confused about that since the spec doesn't define the FSE
clearly. But we need to process it for some hardware implemention. 
> 
> >       3. The stop endpoint command run just after the first TD being
> >       transferred successfully. It means that an
> >       FSE(CC=COMP_STOP_INVAL) queued after the transfer
> >       event(CC=COMP_SUCCESS) for the first TD.  The FSE uses the HW
> >       internal ep ring dequeue which point to the last TRB of the
> >       first TD. And the last TRB of the first TD is the event_trb.
> >
> >       4. Then driver called handle_tx_event() again to process the
> >       FSE. It will try to find out the event_seg which include the
> >       event_trb of the FSE by the line: event_seg =
> >       trb_in_td(ep_ring->deq_seg, ep_ring->dequeue, td->last_trb,
> >       event_dma) It will fail with NULL return value because that
> the
> >       ep_ring->deq_seg and ep_ring->dequeue move to the second TD.
> >       i.e. you can't find out the last TRB of the first TD in the
> >       second TD. And this make the driver go to an wrong way with
> >       "return -ESHUTDWON".
> >
> > Another complicate case will occur when there an Link TD between the
> two
> > TDs.
> 
> Let's take a step back.  For the cancellation code to work properly,
> we
> need to know where the hardware dequeue pointer is.  From the xHCI
> driver's perspective, the previous TD is already done, and the xHCI
> driver's ring dequeue pointer has been moved past that TD (and any
> link
> TRBs that follow).
> 
> All we really care about is whether the HW's dequeue pointer is
> somewhere in the middle of the next TD, so we can decide whether to
> turn
> the TD into no-ops or use a Set TR dequeue pointer.  If the HW is
> actually stopped before the TD, it would be fine to turn the TD into
> no-ops, right?

Yes.
> 
> What happens if we just ignore the FSE event for the last TRB of the
> first TD?  ep->stopped_td should be NULL after the last Stop Endpoint
> command was handled.  When we're deciding whether to turn the canceled
> TD into no-ops or issue a Set TR dequeue pointer, we'll compare that
> TD
> to ep->stopped_td, which won't match, and just turn that TD into
> no-ops.
> That behavior should be fine, since the xHC is supposed to clear any
> cached TRBs when a Stop Endpoint command is issued, and the xHC will
> just go on to process the no-op TRBs.
> 
> I think it might be cleaner to just ignore the FSE, by checking for
> !event_seg && comp_code == COMP_STOP_INVAL, rather than keeping track
> of
> the previous TD.  This also avoids a bad pointer dereference with this
> code if the previous TD has already been freed:
> 
> +               if (!event_seg) {
> +                       event_seg = trb_in_td(prev_td->start_seg,
> +                                       prev_td->first_trb,
> +                                       td->first_trb,
> +                                       event_dma);
> +                       if (event_seg && (trb_comp_code ==
> COMP_STOP_INVAL))
> +                               if (is_force_stopeed_event(event_seg,
> +
> prev_td->last_trb,
> +                                                       event_dma)) {
> +                                       ret = 0;
> +                                       goto cleanup;
> +                               }
> 
> You dereference prev_td in the call to is_force_stopeed_event().
> That's
> bad because it might have been freed when the successful event TD for
> the first TD was processed.  I don't think you should try to prevent
> the
> xhci_td from being freed when an URB is given back, so I think the
> simpler method is to just ignore the FSE event.
> I'm also not certain there is a bug, but the next bit of code also
> checks for a null event_seg, and I wonder what the interaction between
> the two will be:
> 
>                 if (!event_seg) {
>                         if (!ep->skip ||
>                             !
> usb_endpoint_xfer_isoc(&td->urb->ep->desc)) {
>                                 /* HC is busted, give up! */
>                                 xhci_err(xhci,
>                                         "ERROR Transfer event TRB DMA
> ptr not "
>                                         "part of current TD\n");
>                                 return -ESHUTDOWN;
>                         }
> 
>                         ret = skip_isoc_td(xhci, td, event, ep,
> &status);
>                         goto cleanup;
>                 }
> 
> It's probably better if you check for both conditions (!event_seg &&
> comp_code == COMP_STOP_INVAL) before this block.
> 
I'd thought of this conditions too but not sure if it is enough for FSE.
That is one reason for me to chose the prev_td do that which introduce
the crisis as you said. By comparison, the method of your is the better
one.

> > So I think we need an patch for these cases like that:
> >       1. Record the TD as prev_td in ep_ring before delete it from
> the
> > td_list.
> 
> I don't think this is the right approach, because the xhci_td may get
> freed.  You could prevent it from being freed until the next TD gets
> completed, but then you'll have to add extra checking to the bandwidth
> code to make sure and delete the last recorded TD.  That way seems
> more
> complicated and prone to introducing bugs.
> 
> >       2. Find the FSE event_trb in the prev_td or between the
> prev_td
> > and cur_td
> >          for it maybe a link TD.
> >       3. Just jump over the FSE in the event ring because the prev
> > transfer event
> >          have been processed.
> >
> > That's all my think about the FSE. It's too hard to figure out
> them. :(
> 
> Thank you for the explanation, that's much clearer than the patch
> description.  I would suggest you put something like it in the next
> patch description.  Don't be afraid to send really long descriptions
> if
> it's a really complicated bug, even if the patch description turns out
> longer than the fix. :)

One more question.
Will the second TD be added to the cancelled_td_list for the FSE? 
If it is not true, when the driver call handle_stopped_endpoint(), the
condition (cur_td == ep->stopped_td) is false. The
xhci_find_new_dequeue_state() won't be called and the
deq_state.new_deq_ptr and deq_state_new_deq_seg are still zero. Then the
ring_doorbell_for_active_rings() be called and the ep_ring run again. In
other words the stop endpoint command can't stop the ep for the FSE
case.


Thanks,
Alex He

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux