Two NEC uPD720200 adapters have been observed to randomly misbehave:
a Stop Endpoint command fails with Context Error, the Output Context
indicates Stopped state, and the endpoint keeps running. Very often,
Set TR Dequeue Pointer is seen to fail next with Context Error too,
in addition to problems from unexpectedly completed cancelled work.

The pathology is common on fast running isoc endpoints like uvcvideo,
but has also been reproduced on a full-speed bulk endpoint of pl2303.
It seems all EPs are affected, with risk proportional to their load.

Reproduction involves receiving any kind of stream and closing it to
make the device driver cancel URBs already queued in advance.

Deal with it by retrying the command like in the Running state.

Signed-off-by: Michal Pecio <michal.pecio@xxxxxxxxx>
---

This should be my last patch for NEC eccentricities since everything
is working smoothly now. I may or may not find something interesting
on the VL805. It still crashes sometimes for no obvious reason, needing
reboot.

I thought it would be prudent to trigger this on uPD720200 only, hell
knows what bugs other controllers have. Note that the NEC quirk applies
to this specific chip only - its successors have vendor ID of Renesas.

I feel a little dirty retrying something with no obvious stop condition;
is there anything that prevents this from trying forever if things go
really wrong? Same question for the "running" case. I figure a counter
could be easily added for both, if necessary.

I typically see 1 to 3 retries before the command succeeds.

[  +0,000008] usb 9-2: Selecting alternate setting 9 (20480 B/frame bandwidth)
[  +0,005639] usb 9-2: Allocated 5 URB buffers of 32x20480 bytes each
[  +0,292400] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000051] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000109] xhci_hcd 0000:02:00.0: It worked!
[  +0,000087] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000047] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000117] xhci_hcd 0000:02:00.0: It worked!
[  +0,000040] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000040] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000045] xhci_hcd 0000:02:00.0: Retrying STOP on buggy NEC
[  +0,000123] xhci_hcd 0000:02:00.0: It worked!

 drivers/usb/host/xhci-ring.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9673354d70d5..7edd655cb6b4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1147,6 +1147,15 @@ static void xhci_handle_cmd_stop_ep(struct xhci_hcd *xhci, int slot_id,
 				break;
 			ep->ep_state &= ~EP_STOP_CMD_PENDING;
 			return;
+		case EP_STATE_STOPPED:
+			/*
+			 * NEC uPD720200 sometimes sets this state and fails with
+			 * Context Error while continuing to process TRBs.
+			 * Be conservative and trust EP_CTX_STATE on other chips.
+			 */
+			if (!(xhci->quirks & XHCI_NEC_HOST))
+				break;
+			fallthrough;
 		case EP_STATE_RUNNING:
 			/* Race, HW handled stop ep cmd before ep was running */
 			xhci_dbg(xhci, "Stop ep completion ctx error, ep is running\n");
-- 
2.43.0