Hi, Michal Pecio <michal.pecio@xxxxxxxxx> 於 2025年2月11日 週二 下午8:36寫道: > > xHCI 4.9.1 requires HCs to obey the IOC flag we set on the last TRB even > after an error has been reported on an earlier TRB. This typically means > that an error mid TD is followed by a success event for the last TRB. > > On SuperSpeed (and only SS) isochronous endpoints Etron hosts were found > to emit a success event also after an error on the last TRB of a TD. > > Reuse the machinery for handling errors mid TD to handle these otherwise > unexpected events. Avoid printing "TRB not part of current TD" errors, > ensure proper tracking of HC's internal dequeue pointer and distinguish > this known quirk from other bogus events caused by ordinary bugs. > > This patch was found to eliminate all related warnings and errors while > running for 30 minutes with a UVC camera on a flaky cable which produces > transaction errors about every second. An altsetting was chosen which > causes some TDs to be multi-TRB, dynamic debug was used to confirm that > errors both mid TD and on the last TRB are handled as expected: > > [ 6028.439776] xhci_hcd 0000:06:00.0: Transfer error for slot 1 ep 2 on endpoint > [ 6028.439784] xhci_hcd 0000:06:00.0: Error 4 mid isoc TD, wait for final completion event, is_last_trb=1 > [ 6028.440268] xhci_hcd 0000:06:00.0: Successful completion on short TX for slot 1 ep 2 with last td short 0 > [ 6028.440270] xhci_hcd 0000:06:00.0: Got event 1 after mid TD error > [ 6029.123683] xhci_hcd 0000:06:00.0: Transfer error for slot 1 ep 2 on endpoint > [ 6029.123694] xhci_hcd 0000:06:00.0: Error 4 mid isoc TD, wait for final completion event, is_last_trb=0 > [ 6029.123697] xhci_hcd 0000:06:00.0: Successful completion on short TX for slot 1 ep 2 with last td short 0 > [ 6029.123700] xhci_hcd 0000:06:00.0: Got event 1 after mid TD error > > Handling of Stopped events is unaffected: finish_td() is called but it > does nothing and the TD waits until it's unlinked: > > [ 7081.705544] xhci_hcd 0000:06:00.0: Transfer error for slot 1 ep 2 on endpoint > [ 7081.705546] xhci_hcd 0000:06:00.0: Error 4 mid isoc TD, wait for final completion event, is_last_trb=1 > [ 7081.705630] xhci_hcd 0000:06:00.0: Stopped on Transfer TRB for slot 1 ep 2 > [ 7081.705633] xhci_hcd 0000:06:00.0: Got event 26 after mid TD error > [ 7081.705678] xhci_hcd 0000:06:00.0: Stopped on Transfer TRB for slot 1 ep 2 > [ 7081.705680] xhci_hcd 0000:06:00.0: Got event 26 after mid TD error > [ 7081.705759] xhci_hcd 0000:06:00.0: Stopped on No-op or Link TRB for slot 1 ep 2 > [ 7081.705799] xhci_hcd 0000:06:00.0: Stopped on No-op or Link TRB for slot 1 ep 2 > > Reported-by: Kuangyi Chiang <ki.chiang65@xxxxxxxxx> > Closes: https://lore.kernel.org/linux-usb/20250205053750.28251-1-ki.chiang65@xxxxxxxxx/T/ > Signed-off-by: Michal Pecio <michal.pecio@xxxxxxxxx> > --- > > > > Hi Mathias, > > This is the best I was able to do. It does add a few lines, but I don't > think it's too scary and IMO the switch looks even better this way. It > accurately predicts those events while not breaking anything else that > I can see or think of, save for the risk of firmware bugfix adding one > ESIT of latency on errors. > > I tried to also test your Etron patch but it has whitespace damage all > over the place and would be hard to apply. > > Regards, > Michal > > > drivers/usb/host/xhci-ring.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 965bffce301e..7ff5075e5890 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -2437,6 +2437,7 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, > bool sum_trbs_for_length = false; > u32 remaining, requested, ep_trb_len; > int short_framestatus; > + bool error_event = false, etron_quirk = false; > > trb_comp_code = GET_COMP_CODE(le32_to_cpu(event->transfer_len)); > urb_priv = td->urb->hcpriv; > @@ -2473,8 +2474,7 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, > fallthrough; > case COMP_ISOCH_BUFFER_OVERRUN: > frame->status = -EOVERFLOW; > - if (ep_trb != td->end_trb) > - td->error_mid_td = true; > + error_event = true; > break; > case COMP_INCOMPATIBLE_DEVICE_ERROR: > case COMP_STALL_ERROR: > @@ -2483,8 +2483,7 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, > case COMP_USB_TRANSACTION_ERROR: > frame->status = -EPROTO; > sum_trbs_for_length = true; > - if (ep_trb != td->end_trb) > - td->error_mid_td = true; > + error_event = true; > break; > case COMP_STOPPED: > sum_trbs_for_length = true; > @@ -2518,8 +2517,17 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, > td->urb->actual_length += frame->actual_length; > > finish_td: > + /* An error event mid TD will be followed by more events, xHCI 4.9.1 */ > + td->error_mid_td |= error_event && (ep_trb != td->end_trb); > + > + /* Etron treats *all* SuperSpeed isoc errors like errors mid TD */ > + if (xhci->quirks & XHCI_ETRON_HOST && td->urb->dev->speed == USB_SPEED_SUPER) { > + td->error_mid_td |= error_event; > + etron_quirk |= error_event; This would be the same as etron_quirk = error_event; right? > + } > + > /* Don't give back TD yet if we encountered an error mid TD */ > - if (td->error_mid_td && ep_trb != td->end_trb) { > + if (td->error_mid_td && (ep_trb != td->end_trb || etron_quirk)) { > xhci_dbg(xhci, "Error mid isoc TD, wait for final completion event\n"); > td->urb_length_set = true; > return; > -- > 2.48.1 I tested this with Etron EJ168 and EJ188 under Linux-6.13.1. It works. Thanks, Kuangyi Chiang