Hi, Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> 於 2025年2月5日 週三 下午11:16寫道: > > On 5.2.2025 16.17, Mathias Nyman wrote: > > On 5.2.2025 7.37, Kuangyi Chiang wrote: > >> Unplugging a USB3.0 webcam while streaming results in errors like this: > >> > >> [ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 > >> [ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0 > >> [ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 > >> [ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0 > >> > >> If an error is detected while processing the last TRB of an isoc TD, > >> the Etron xHC generates two transfer events for the TRB where the > >> error was detected. The first event can be any sort of error (like > >> USB Transaction or Babble Detected, etc), and the final event is > >> Success. > >> > >> The xHCI driver will handle the TD after the first event and remove it > >> from its internal list, and then print an "Transfer event TRB DMA ptr > >> not part of current TD" error message after the final event. > >> > >> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a > >> transaction error mid TD.") is designed to address isoc transaction > >> errors, but unfortunately it doesn't account for this scenario. > >> > >> To work around this by reusing the logic that handles isoc transaction > >> errors, but continuing to wait for the final event when this condition > >> occurs. Sometimes we see the Stopped event after an error mid TD, this > >> is a normal event for a pending TD and we can think of it as the final > >> event we are waiting for. > > > > Not giving back the TD when we get an event for the last TRB in the > > TD sounds risky. With this change we assume all old and future ETRON hosts > > will trigger this additional spurious success event. > > > > I think we could handle this more like the XHCI_SPURIOUS_SUCCESS case seen > > with short transfers, and just silence the error message. > > > > Are there any other issues besides the error message seen? > > > > Would something like this work: (untested, compiles) > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 965bffce301e..81d45ddebace 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -2644,6 +2644,22 @@ static int handle_transferless_tx_event(struct xhci_hcd *xhci, struct xhci_virt_ > return 0; > } > > +static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci, > + struct xhci_ring *ring) > +{ > + switch(ring->last_td_comp_code) { > + case COMP_SHORT_PACKET: > + return xhci->quirks & XHCI_SPURIOUS_SUCCESS; > + case COMP_USB_TRANSACTION_ERROR: > + case COMP_BABBLE_DETECTED_ERROR: > + case COMP_ISOCH_BUFFER_OVERRUN: > + return xhci->quirks & XHCI_ETRON_HOST && > + ring->type == TYPE_ISOC; > + default: > + return false; > + } > +} > + > /* > * If this function returns an error condition, it means it got a Transfer > * event with a corrupted Slot ID, Endpoint ID, or TRB DMA address. > @@ -2697,8 +2713,8 @@ static int handle_tx_event(struct xhci_hcd *xhci, > case COMP_SUCCESS: > if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) { > trb_comp_code = COMP_SHORT_PACKET; > - xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n", > - slot_id, ep_index, ep_ring->last_td_was_short); > + xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td comp code %d\n", > + slot_id, ep_index, ep_ring->last_td_comp_code); > } > break; > case COMP_SHORT_PACKET: > @@ -2846,7 +2862,7 @@ static int handle_tx_event(struct xhci_hcd *xhci, > */ > if (trb_comp_code != COMP_STOPPED && > trb_comp_code != COMP_STOPPED_LENGTH_INVALID && > - !ep_ring->last_td_was_short) { > + !xhci_spurious_success_tx_event(xhci, ep_ring)) { > xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n", > slot_id, ep_index); > } > @@ -2890,11 +2906,10 @@ static int handle_tx_event(struct xhci_hcd *xhci, > > /* > * Some hosts give a spurious success event after a short > - * transfer. Ignore it. > + * transfer or error on last TRB. Ignore it. > */ > - if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) && > - ep_ring->last_td_was_short) { > - ep_ring->last_td_was_short = false; > + if (xhci_spurious_success_tx_event(xhci, ep_ring)) { > + ep_ring->last_td_comp_code = trb_comp_code; > return 0; > } > > @@ -2922,10 +2937,7 @@ static int handle_tx_event(struct xhci_hcd *xhci, > */ > } while (ep->skip); > > - if (trb_comp_code == COMP_SHORT_PACKET) > - ep_ring->last_td_was_short = true; > - else > - ep_ring->last_td_was_short = false; > + ep_ring->last_td_comp_code = trb_comp_code; > > ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)]; > trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb, ep_trb_dma); > diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h > index 779b01dee068..acc8d3c7a199 100644 > --- a/drivers/usb/host/xhci.h > +++ b/drivers/usb/host/xhci.h > @@ -1371,7 +1371,7 @@ struct xhci_ring { > unsigned int num_trbs_free; /* used only by xhci DbC */ > unsigned int bounce_buf_len; > enum xhci_ring_type type; > - bool last_td_was_short; > + u32 last_td_comp_code; > struct radix_tree_root *trb_address_map; > }; > It looks better than my patch v2, I will test it later. Thanks, Kuangyi Chiang