Re: XHCI: Handling of Zero length packet in data stage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry for the very delayed response on this, I'm trying to catch up on
my email from around Kernel Summit and my vacation.

On Tue, Oct 08, 2013 at 10:19:11AM -0400, Alan Stern wrote:
> On Mon, 7 Oct 2013, Sarah Sharp wrote:
> 
> > > i am using 3.10 kernel. Also i looked at tip i see same implementation for
> > > process_ctrl_td()
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/host/xhci-ring.c
> > 
> > Yeah, the implementation has always been this way.  I've known since the
> > beginning of writing the xHCI driver that it needs to handle zero-length
> > data phases, but hadn't had ever had a report that it was required by a
> > device under Linux.
> 
> This code, near the end of process_ctrl_td(), looks a little strange:

I really hate the xHCI ring code. :(  I would rip it out and totally
start over, but I'm afraid of introducing more low-level bugs than we
already have.

The problem with process_ctrl_td() is that it's trying to process events
for all three phases of the control transfer with the same code, and
it's a mess.  Even I have to sit there with with the xHCI spec and map
out all the different ways the code handles error conditions.

> 	/*
> 	 * Did we transfer any data, despite the errors that might have
> 	 * happened?  I.e. did we get past the setup stage?
> 	 */
> 	if (event_trb != ep_ring->dequeue) {
> 		/* The event was for the status stage */
> 		if (event_trb == td->last_trb) {
> 			if (td->urb->actual_length != 0) {
> 				/* Don't overwrite a previously set error code
> 				 */
> 				if ((*status == -EINPROGRESS || *status == 0) &&
> 						(td->urb->transfer_flags
> 						 & URB_SHORT_NOT_OK))
> 					/* Did we already see a short data
> 					 * stage? */
> 					*status = -EREMOTEIO;
> 
> If you already saw a short data stage, why isn't the status already set
> to -EREMOTEIO?
>
> Also what's the reason for the test "td->urb->actual_length != 0"?  
> What does this have to do with getting a short data stage?  Are you 
> assuming that at this point, actual_length will be nonzero if and only 
> if there was a short data stage?
> 
> 			} else {
> 				td->urb->actual_length =
> 					td->urb->transfer_buffer_length;
> 			}
> 
> What's the purpose of this clause?
>
> It looks like the driver is confusing "actual_length == 0" with 
> "actual_length was never set".  What if actual_length _was_ previously 
> set, but it was set to 0?

Right, I understand what you're saying now.  The code is trying to
handle two separate cases in one function:

1. The setup, data, and status phases all complete successfully.  The
xHC only sends a successful completion event for the status phase.

2. The setup phase completes successfully, the data phase is short, and
the status phase is successful.  The xHC will generate a short completion
event for the short data phase, and a successful completion event for
the status phase.

Here's the code:

        switch (trb_comp_code) {
        case COMP_SUCCESS:
                if (event_trb == ep_ring->dequeue) {
                        xhci_warn(xhci, "WARN: Success on ctrl setup TRB "
                                        "without IOC set??\n");
                        *status = -ESHUTDOWN;
                } else if (event_trb != td->last_trb) {
                        xhci_warn(xhci, "WARN: Success on ctrl data TRB "
                                        "without IOC set??\n");
                        *status = -ESHUTDOWN;
                } else {
                        *status = 0;
                }
                break;
        case COMP_SHORT_TX:
                if (td->urb->transfer_flags & URB_SHORT_NOT_OK)
                        *status = -EREMOTEIO;
                else
                        *status = 0;
                break;

...
        if (event_trb != ep_ring->dequeue) {
                /* The event was for the status stage */
                if (event_trb == td->last_trb) {
                        if (td->urb->actual_length != 0) {
                                /* Don't overwrite a previously set error code
                                 */
                                if ((*status == -EINPROGRESS || *status == 0) &&
                                                (td->urb->transfer_flags
                                                 & URB_SHORT_NOT_OK))
                                        /* Did we already see a short data
                                         * stage? */
                                        *status = -EREMOTEIO;
                        } else {
                                td->urb->actual_length =
                                        td->urb->transfer_buffer_length;
                        }
                } else {
                /* Maybe the event was for the data stage? */
                        td->urb->actual_length =
                                td->urb->transfer_buffer_length -
                                EVENT_TRB_LEN(le32_to_cpu(event->transfer_len));
                        xhci_dbg(xhci, "Waiting for status "
                                        "stage event\n");
                        return 0;
                }
        }

For the simple case (successful event on status phase only), *status gets
set to zero in the switch statement.  We fall through to the if, and see that
the event was for the status stage.  We fall through to the else statement in
this chunk of code, since td->urb->actual_length isn't set:

                        if (td->urb->actual_length != 0) {
                                /* Don't overwrite a previously set error code
                                 */
                                if ((*status == -EINPROGRESS || *status == 0) &&
                                                (td->urb->transfer_flags
                                                 & URB_SHORT_NOT_OK))
                                        /* Did we already see a short data
                                         * stage? */
                                        *status = -EREMOTEIO;
                        } else {
                                td->urb->actual_length =
                                        td->urb->transfer_buffer_length;
                        }

For case #2, on the short data phase, we set *status = -EREMOTEIO in the
switch statement, but it has no impact on the overall code, because it's
only used in the caller if the urb is being completed.  Then we fall
through to the data stage else statement and set urb->actual_length.

Then we get the successful event for the status stage, which
unconditionally overwrites urb->status to zero in the switch statement.
We hit the status stage block in the lower code, and to differentiate
between case #1 and case #2, we check if urb->actual_length is set.
That obviously doesn't work if the data phase transfered zero bytes.

So, yes, your analysis was correct.  I just needed to stare at the code
to get the state into my head.

I guess the biggest confusion I had when I wrote this code was: what
should the URB status and actual_length be if the data phase transferred
successfully, but an error occurred on the status phase?  I don't think
the current code handles that very well for 0.95 xHCI hosts.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux