On Fri, 6 Dec 2013, Sarah Sharp wrote: > I really hate the xHCI ring code. :( I would rip it out and totally > start over, but I'm afraid of introducing more low-level bugs than we > already have. I haven't paid close attention to it. The impression I occasionally get is that significant simplifications may be possible. > The problem with process_ctrl_td() is that it's trying to process events > for all three phases of the control transfer with the same code, and > it's a mess. Even I have to sit there with with the xHCI spec and map > out all the different ways the code handles error conditions. It shouldn't be a mess. The general rules are very straightforward: Data from a SETUP TRB doesn't contribute toward urb->actual_length. Data from other TRBs does. If an IN TRB is short and URB_SHORT_NOT_OK is clear, continue on to the next TRB in the usual manner. For any other unexpected outcome, end the transfer immediately with an appropriate error code and stop the endpoint ring. I realize that xHCI doesn't give you any way to stop the endpoint ring when a TRB is short. In such cases you'll just have to wait until you see the completion event for the final TRB, and leave the ring running. > Right, I understand what you're saying now. The code is trying to > handle two separate cases in one function: > > 1. The setup, data, and status phases all complete successfully. The > xHC only sends a successful completion event for the status phase. Okay, that should be easy to handle. > 2. The setup phase completes successfully, the data phase is short, and > the status phase is successful. The xHC will generate a short completion > event for the short data phase, and a successful completion event for > the status phase. In this case, you should set urb->actual_length in the usual way. You can even set the status to 0; __usb_hcd_giveback_urb() will change the status to -EREMOTEIO for you if URB_SHORT_NOT_OK is set. The upshot is: You can treat this case the same as the previous case. > I guess the biggest confusion I had when I wrote this code was: what > should the URB status and actual_length be if the data phase transferred > successfully, but an error occurred on the status phase? I don't think > the current code handles that very well for 0.95 xHCI hosts. urb->actual_length should _always_ indicate the number of data bytes transferred. That's true regardless of whether or not an error occurs. If the setup and data stages are successful but an error occurs during the status stage, then the URB status should reflect this error. The only time urb->status is 0 is when there are no errors. (Note that a short IN data transfer counts as an error if and only if URB_SHORT_NOT_OK is set.) > I do finally understand the issue, after staring at the spaghetti code > for a couple days. There's also a couple other corner cases that don't > look like they're covered: > > Case 1: > - Successful data OUT phase > - Short status IN phase > - In this case, if the upper layer doesn't set the URB_SHORT_NOT_OK > flag, urb->status would be set to 0, urb->actual_length = > transfer_buffer_length > - End result: upper layer thinks the control OUT completed fine, when > it didn't. > > Alan, I'm actually not sure what error status should be returned in this > case. Should it be -EREMOTEIO or -EPROTO? This can never happen, since the status stage always has length 0. You can't be shorter than 0 bytes! :-) > I'm actually not convinced we can even get a short completion code for a > status TRB. The xHCI spec says "The xHC shall wait indefinitely for a > Success, Stall Error or other error response from device for the Status > stage." I'm not sure if a short completion code is included in that > "other error response" phrase. > > Case 2: > - Babble on data IN phase > - 0.95 xHCI host does not halt endpoint (the 0.96 spec errata clarified > that endpoints should be halted on a babble error) Sounds like a genuine bug in 0.95. Babble should always stop the endpoint ring. But I guess you can't do anything about it. > - At this point, I'm not sure what completion code the status OUT phase > will finish with. If it's successful, the babble status is not > passed up in urb->status. This is one situation where you will need to remember the error. When the status stage ends, you must return the -EOVERFLOW error code remembered from the data stage. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html