Re: [RFC 0/9] xHCI 1.0 patchset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 10, 2011 at 11:36:48AM +0800, He, Alex wrote:
> Hi Sarah,
> 
> Thanks for your correction and suggestion.
> 
> I am sorry for that I don't give a very clear description for FSE patch.
> Let me to explain it again with a special scenario.
> 	1. There are two TDs in td_list that will be transferred.

> 	2. The first TD was transferred successfully reported by an
> 	transfer event with COMP_SUCCESS. The xHCI driver called
> 	finish_td() to remove the first TD from the td_list and increase
> 	the ep_ring->dequeue and ep_ring->deq_seg to the next TD. But
> 	the HW internal ep ring dequeue maintained by xHC may stay at
> 	the last TRB of the first TD before start the next transfer for
> 	the second TD.

Ok, at first I wondered how this could ever happen, since it kind of
sounds like a hardware bug, but it's documented in Table 3 of the xHCI
spec, so I guess it's legitimate behavior.

> 	3. The stop endpoint command run just after the first TD being
> 	transferred successfully. It means that an
> 	FSE(CC=COMP_STOP_INVAL) queued after the transfer
> 	event(CC=COMP_SUCCESS) for the first TD.  The FSE uses the HW
> 	internal ep ring dequeue which point to the last TRB of the
> 	first TD. And the last TRB of the first TD is the event_trb.
>
> 	4. Then driver called handle_tx_event() again to process the
> 	FSE. It will try to find out the event_seg which include the
> 	event_trb of the FSE by the line: event_seg =
> 	trb_in_td(ep_ring->deq_seg, ep_ring->dequeue, td->last_trb,
> 	event_dma) It will fail with NULL return value because that the
> 	ep_ring->deq_seg and ep_ring->dequeue move to the second TD.
> 	i.e. you can't find out the last TRB of the first TD in the
> 	second TD. And this make the driver go to an wrong way with
> 	"return -ESHUTDWON".
>
> Another complicate case will occur when there an Link TD between the two
> TDs.

Let's take a step back.  For the cancellation code to work properly, we
need to know where the hardware dequeue pointer is.  From the xHCI
driver's perspective, the previous TD is already done, and the xHCI
driver's ring dequeue pointer has been moved past that TD (and any link
TRBs that follow).

All we really care about is whether the HW's dequeue pointer is
somewhere in the middle of the next TD, so we can decide whether to turn
the TD into no-ops or use a Set TR dequeue pointer.  If the HW is
actually stopped before the TD, it would be fine to turn the TD into
no-ops, right?

What happens if we just ignore the FSE event for the last TRB of the
first TD?  ep->stopped_td should be NULL after the last Stop Endpoint
command was handled.  When we're deciding whether to turn the canceled
TD into no-ops or issue a Set TR dequeue pointer, we'll compare that TD
to ep->stopped_td, which won't match, and just turn that TD into no-ops.
That behavior should be fine, since the xHC is supposed to clear any
cached TRBs when a Stop Endpoint command is issued, and the xHC will
just go on to process the no-op TRBs.

I think it might be cleaner to just ignore the FSE, by checking for
!event_seg && comp_code == COMP_STOP_INVAL, rather than keeping track of
the previous TD.  This also avoids a bad pointer dereference with this
code if the previous TD has already been freed:

+               if (!event_seg) {
+                       event_seg = trb_in_td(prev_td->start_seg,
+                                       prev_td->first_trb,
+                                       td->first_trb,
+                                       event_dma);
+                       if (event_seg && (trb_comp_code == COMP_STOP_INVAL))
+                               if (is_force_stopeed_event(event_seg,
+                                                       prev_td->last_trb,
+                                                       event_dma)) {
+                                       ret = 0;
+                                       goto cleanup;
+                               }

You dereference prev_td in the call to is_force_stopeed_event().  That's
bad because it might have been freed when the successful event TD for
the first TD was processed.  I don't think you should try to prevent the
xhci_td from being freed when an URB is given back, so I think the
simpler method is to just ignore the FSE event.

I'm also not certain there is a bug, but the next bit of code also
checks for a null event_seg, and I wonder what the interaction between
the two will be:

		if (!event_seg) {
                        if (!ep->skip ||
                            !usb_endpoint_xfer_isoc(&td->urb->ep->desc)) {
                                /* HC is busted, give up! */
                                xhci_err(xhci,
                                        "ERROR Transfer event TRB DMA ptr not "
                                        "part of current TD\n");
                                return -ESHUTDOWN;
                        }

                        ret = skip_isoc_td(xhci, td, event, ep, &status);
                        goto cleanup;
                }

It's probably better if you check for both conditions (!event_seg &&
comp_code == COMP_STOP_INVAL) before this block.


> So I think we need an patch for these cases like that:
> 	1. Record the TD as prev_td in ep_ring before delete it from the
> td_list.

I don't think this is the right approach, because the xhci_td may get
freed.  You could prevent it from being freed until the next TD gets
completed, but then you'll have to add extra checking to the bandwidth
code to make sure and delete the last recorded TD.  That way seems more
complicated and prone to introducing bugs.

> 	2. Find the FSE event_trb in the prev_td or between the prev_td
> and cur_td
> 	   for it maybe a link TD.
> 	3. Just jump over the FSE in the event ring because the prev
> transfer event
> 	   have been processed.
> 
> That's all my think about the FSE. It's too hard to figure out them. :(

Thank you for the explanation, that's much clearer than the patch
description.  I would suggest you put something like it in the next
patch description.  Don't be afraid to send really long descriptions if
it's a really complicated bug, even if the patch description turns out
longer than the fix. :)

> BTW: I will refine the other patches and commit them later. 

Sure, thank you.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux