Re: [PATCH v4 1/1] xhci: Correctly handle last TRB of isoc TD on Etron xHCI host

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> 於 2025年2月5日 週三 下午11:16寫道:
>
> On 5.2.2025 16.17, Mathias Nyman wrote:
> > On 5.2.2025 7.37, Kuangyi Chiang wrote:
> >> Unplugging a USB3.0 webcam while streaming results in errors like this:
> >>
> >> [ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> >> [ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
> >> [ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> >> [ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
> >>
> >> If an error is detected while processing the last TRB of an isoc TD,
> >> the Etron xHC generates two transfer events for the TRB where the
> >> error was detected. The first event can be any sort of error (like
> >> USB Transaction or Babble Detected, etc), and the final event is
> >> Success.
> >>
> >> The xHCI driver will handle the TD after the first event and remove it
> >> from its internal list, and then print an "Transfer event TRB DMA ptr
> >> not part of current TD" error message after the final event.
> >>
> >> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
> >> transaction error mid TD.") is designed to address isoc transaction
> >> errors, but unfortunately it doesn't account for this scenario.
> >>
> >> To work around this by reusing the logic that handles isoc transaction
> >> errors, but continuing to wait for the final event when this condition
> >> occurs. Sometimes we see the Stopped event after an error mid TD, this
> >> is a normal event for a pending TD and we can think of it as the final
> >> event we are waiting for.
> >
> > Not giving back the TD when we get an event for the last TRB in the
> > TD sounds risky. With this change we assume all old and future ETRON hosts
> > will trigger this additional spurious success event.
> >
> > I think we could handle this more like the XHCI_SPURIOUS_SUCCESS case seen
> > with short transfers, and just silence the error message.
> >
> > Are there any other issues besides the error message seen?
> >
>
> Would something like this work: (untested, compiles)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 965bffce301e..81d45ddebace 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -2644,6 +2644,22 @@ static int handle_transferless_tx_event(struct xhci_hcd *xhci, struct xhci_virt_
>          return 0;
>   }
>
> +static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci,
> +                                          struct xhci_ring *ring)
> +{
> +       switch(ring->last_td_comp_code) {
> +       case COMP_SHORT_PACKET:
> +               return xhci->quirks & XHCI_SPURIOUS_SUCCESS;
> +       case COMP_USB_TRANSACTION_ERROR:
> +       case COMP_BABBLE_DETECTED_ERROR:
> +       case COMP_ISOCH_BUFFER_OVERRUN:
> +               return xhci->quirks & XHCI_ETRON_HOST &&
> +                       ring->type == TYPE_ISOC;
> +       default:
> +               return false;
> +       }
> +}
> +
>   /*
>    * If this function returns an error condition, it means it got a Transfer
>    * event with a corrupted Slot ID, Endpoint ID, or TRB DMA address.
> @@ -2697,8 +2713,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>          case COMP_SUCCESS:
>                  if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
>                          trb_comp_code = COMP_SHORT_PACKET;
> -                       xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n",
> -                                slot_id, ep_index, ep_ring->last_td_was_short);
> +                       xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td comp code %d\n",
> +                                slot_id, ep_index, ep_ring->last_td_comp_code);
>                  }
>                  break;
>          case COMP_SHORT_PACKET:
> @@ -2846,7 +2862,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>                   */
>                  if (trb_comp_code != COMP_STOPPED &&
>                      trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
> -                   !ep_ring->last_td_was_short) {
> +                   !xhci_spurious_success_tx_event(xhci, ep_ring)) {
>                          xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
>                                    slot_id, ep_index);
>                  }
> @@ -2890,11 +2906,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>
>                          /*
>                           * Some hosts give a spurious success event after a short
> -                        * transfer. Ignore it.
> +                        * transfer or error on last TRB. Ignore it.
>                           */
> -                       if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) &&
> -                           ep_ring->last_td_was_short) {
> -                               ep_ring->last_td_was_short = false;
> +                       if (xhci_spurious_success_tx_event(xhci, ep_ring)) {
> +                               ep_ring->last_td_comp_code = trb_comp_code;
>                                  return 0;
>                          }
>
> @@ -2922,10 +2937,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>           */
>          } while (ep->skip);
>
> -       if (trb_comp_code == COMP_SHORT_PACKET)
> -               ep_ring->last_td_was_short = true;
> -       else
> -               ep_ring->last_td_was_short = false;
> +       ep_ring->last_td_comp_code = trb_comp_code;
>
>          ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)];
>          trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb, ep_trb_dma);
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 779b01dee068..acc8d3c7a199 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1371,7 +1371,7 @@ struct xhci_ring {
>          unsigned int            num_trbs_free; /* used only by xhci DbC */
>          unsigned int            bounce_buf_len;
>          enum xhci_ring_type     type;
> -       bool                    last_td_was_short;
> +       u32                     last_td_comp_code;
>          struct radix_tree_root  *trb_address_map;
>   };
>

It looks better than my patch v2, I will test it later.

Thanks,
Kuangyi Chiang





[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux