> From: Sarah Sharp [mailto:sarah.a.sharp@xxxxxxxxxxxxxxx] > First, Andiry already sent a patch to update the dequeue pointer in > process_isoc_td(). That just got added to the master branch a couple > days ago. So you should take that bit of code out of this patch. Will do. > Those BUG() calls were there for a reason. Does the system work without > them? Yes, it works without them, at least with the Synopsys controller. I also tested a bit with the NEC controller and didn't see any problems. According to Linus, drivers should do WARN_ON() instead of BUG() in interrupt context, unless it's absolutely impossible to continue. Sorry, I don't have a reference to that email handy. > Can you break out the fixes for uninitialized variables into a separate > patch? Will do. > Please break this into a separate patch. I see that bits 10:0 are > supposed to be the maximum packet size, which should make a bitmask of > 0x7ff, not 0x3ff. Thanks for catching that. Will do. > > @@ -131,7 +131,7 @@ static void next_trb(struct xhci_hcd *xhci, > > *seg = (*seg)->next; > > *trb = ((*seg)->trbs); > > } else { > > - *trb = (*trb)++; > > + (*trb)++; > > This part of the patch is from John's compiler fix patch, correct? It's > already been sent to Greg, but hasn't made it into the master branch yet > because Greg hasn't queued it. Yeah, I'll drop that too. By the way, that patch should probably go to -stable too, since it's undefined behavior in C, if I'm not mistaken. > > @@ -474,8 +474,11 @@ void xhci_find_new_dequeue_state(struct xhci_hcd *xhci, > > state->new_deq_seg = find_trb_seg(cur_td->start_seg, > > dev->eps[ep_index].stopped_trb, > > &state->new_cycle_state); > > - if (!state->new_deq_seg) > > - BUG(); > > + if (!state->new_deq_seg) { > > + WARN_ON(1); > > + return; > > + } > > + > > This is not a very good solution. This function is called when the > driver wants to cancel a transfer, and the host is in the middle of the > transfer. If you just return there, then the transfer will never get > taken off the hardware queue, and the hardware could potentially access > freed (or reused) memory. Or the transfer will never get cancelled. > Really bad, either way. > > I'd like to figure out why exactly find_trb_seg() is returning NULL. Do > you have any ideas? If not, I'll send you a debugging patch to get more > info. Sorry, I don't have any ideas. That part of the xhci driver is black magic to me. A debugging patch would be a good idea, I think. > > @@ -1551,6 +1556,10 @@ static int process_isoc_td(struct xhci_hcd *xhci, struct xhci_td *td, > > /* calc actual length */ > > if (ep->skip) { > > td->urb->iso_frame_desc[idx].actual_length = 0; > > + /* Update ring dequeue pointer */ > > + while (ep_ring->dequeue != td->last_trb) > > + inc_deq(xhci, ep_ring, false); > > + inc_deq(xhci, ep_ring, false); > > return finish_td(xhci, td, event_trb, event, ep, status, true); > > } > > As I said, this bit has been sent off to Greg. Will drop it. > > @@ -1839,7 +1850,6 @@ static int handle_tx_event(struct xhci_hcd *xhci, > > xhci_dbg(xhci, "td_list is empty while skip " > > "flag set. Clear skip flag.\n"); > > } > > - ret = 0; > > goto cleanup; > > } > > This bit should be in a separate patch. Will do. > > > @@ -1874,20 +1884,23 @@ static int handle_tx_event(struct xhci_hcd *xhci, > > "Skip it\n"); > > goto cleanup; > > } > > - } > > > > - /* Now update the urb's actual_length and give back to > > - * the core > > - */ > > - if (usb_endpoint_xfer_control(&td->urb->ep->desc)) > > - ret = process_ctrl_td(xhci, td, event_trb, event, ep, > > - &status); > > - else if (usb_endpoint_xfer_isoc(&td->urb->ep->desc)) > > - ret = process_isoc_td(xhci, td, event_trb, event, ep, > > - &status); > > - else > > - ret = process_bulk_intr_td(xhci, td, event_trb, event, > > - ep, &status); > > + /* Now update the urb's actual_length and give back to > > + * the core > > + */ > > + if (usb_endpoint_xfer_control(&td->urb->ep->desc)) > > + ret = process_ctrl_td(xhci, td, event_trb, event, ep, > > + &status); > > + else if (usb_endpoint_xfer_isoc(&td->urb->ep->desc)) > > + ret = process_isoc_td(xhci, td, event_trb, event, ep, > > + &status); > > + else > > + ret = process_bulk_intr_td(xhci, td, event_trb, event, > > + ep, &status); > > + } else { > > + xhci_dbg(xhci, "event_seg is NULL\n"); > > + return -ENODEV; > > + } > > I assume you did this part of the patch because handle_tx_event() was > infinite looping. I don't think it's a very good fix, and I'd rather > know the underlying cause for this. > > The fix looks a bit...odd. The only way the else statement would run is > if this statement is false: > if (!event_seg && > (!ep->skip || > !usb_endpoint_xfer_isoc(&td->urb->ep->desc))) { > > or > if (!(!event_seg && (!ep->skip || !usb_endpoint_xfer_isoc(&td->urb->ep->desc)))) > if ((event_seg || !(!ep->skip || !usb_endpoint_xfer_isoc(&td->urb->ep->desc)))) > if ((event_seg || (ep->skip && usb_endpoint_xfer_isoc(&td->urb->ep->desc)))) > > We know the event_seg is NULL, so the skip flag was set and this was an > isochronous endpoint. I wonder if the event_seg was NULL because of the > lack of setting the dequeue pointer while processing the skipped TDs. > > Does the infinite looping go away when you apply *just* the patch to fix > updating the dequeue pointer? What about the issues with triggering the > BUG() calls while setting the TR dequeue pointer? No, Andiry's fix for the dequeue pointer didn't fix either of those. -- Paul -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html