On Mon, Aug 5, 2024 at 5:50 PM Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote: > > On 5.8.2024 9.49, Mathias Nyman wrote: > > On 4.8.2024 0.51, Łukasz Bartosik wrote: > >> On Thu, Aug 1, 2024 at 5:02 PM Łukasz Bartosik <ukaszb@xxxxxxxxxxxx> wrote: > >>> > >>> [ 103.707882] xhci_hcd 0000:00:0d.0: Stall error at bulk TRB > >>> 1943ad000, remaining 1024, ep deq 1943ad001 > >>> > >>> > >>>> DbC should respond with STALL packets to host if HIT or HOT is set. > >>>> Host side should react to this by sending a ClearFeature(HALT) request to DbC, > >>>> which should clear the halted endpoint and HIT/HOT flags. > >>>> > >>> > >>> I would like to ask you about it again because I wonder when the DbC > >>> endpoint is not halted and it > >>> receives the ClearFeature(Halt) request whether this is correct > >>> behavior for the DbC endpoint to report a > >>> STALL error in such a case ? > >>> > > > > Ok, Setup was unclear to me, I assumed the STALL transfer error was due to an > > actual transfer issue on a bulk endpoint. > > > > I don't think xHCI DbC should send STALL error transfer events for non existing > > bulk transfers as a response to ClearFeature(ENDPOINT_HALT) control request. > > > > Especially as it seems not a single byte was transferred on either bulk endpoint. > > > > But we need to handle it in the driver properly > > > > > >>> Thanks, > >>> Lukasz > >>> > >> > >> Hi Mathias, > >> > >> One more thing which I would like to add to the previous is the observation > >> which I made during debugging the issue. Looking at the above trace > >> there is stall > >> on IN endpoint: > >> xhci_dbc_handle_event: EVENT: TRB 00000001943ad000 status 'Stall > >> Error' len 1024 slot 1 ep 3 type 'Transfer Event' flags e:C > >> > >> And then when 24 bytes packet arrives > >> xhci_dbc_handle_event: EVENT: TRB 00000001943ad000 status 'Short > >> Packet' len 1000 slot 1 ep 3 type 'Transfer Event' flags e:C > >> > >> the same TRB is being used 00000001943ad000 but the DbC driver already > >> moved forward and the IN endpoint's pending list does not include a > >> dbc_request pointing > >> to the TRB 00000001943ad000 which results in "no matched request" > >> error and dropping > >> of the packet. > >> > > > > I noticed yes, this is why the patch compared the TRB pointed to by the > > STALL transfer event with the one in the endpoint context. > > If they match the patch attempts to turn that TRB to no-op, forcing xHC > > hardware to move to the next TRB. > > > > The check did however not work as it didn't mask out the cycle bit. > > Lops show we compare 1943ad000 with 1943ad001 > > > > [ 103.707882] xhci_hcd 0000:00:0d.0: Stall error at bulk TRB 1943ad000, remaining 1024, ep deq 1943ad001 > > > > Code in patch: > > +if (ep_ctx->deq == event->trans_event.buffer) { // FIXME do we need to worry about cycle bit? > > + dev_warn(dbc->dev, "Stall error TRB matches ep_ctx->deq, clear it\n"); > > + trb_to_noop(req->trb); > > > > With the new information that the STALL transfer event can be completely > > spurious and not related to any actual bytes being transmitted on bulk > > endpoints I think we shouldn't give back he transfer req in those cases. > > > > I'll make another patch > > New patch ready for testing: > https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/commit/?h=fix_dbc_halted_ep&id=96cd909cc8115b3d2dff1bdcf265171bb0fdab18 > Hi Mathias, I finally tested your latest patch 96cd909cc8115b3d2dff1bdcf265171bb0fdab18. It resolves the issue. Here are trace logs from one of the runs: # entries-in-buffer/entries-written: 23/23 #P:12 # # _-----=> irqs-off/BH-disabled # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # |||| / delay # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | kworker/4:0-30 [004] d..1. 427.334578: xhci_dbc_handle_event: EVENT: TRB 000000019dbcaaa0 status 'Stall Error' len 0 slot 1 ep 2 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.340272: xhci_dbc_handle_event: EVENT: TRB 000000019aef5960 status 'Stall Error' len 1024 slot 1 ep 3 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.340273: xhci_dbc_handle_transfer: BULK: Buffer 000000017255a000 length 1024 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:C kworker/4:0-30 [004] d..1. 427.382584: xhci_dbc_handle_event: EVENT: TRB 000000019aef5960 status 'Short Packet' len 1000 slot 1 ep 3 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.382586: xhci_dbc_handle_transfer: BULK: Buffer 000000017255a000 length 1024 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:C kworker/4:0-30 [004] d..1. 427.382587: xhci_dbc_giveback_request: bulk-in: req 000000005c9bd064 length 24/1024 ==> -6 kworker/4:0-30 [004] d..1. 427.382592: xhci_dbc_handle_event: EVENT: TRB 000000019aef5970 status 'Short Packet' len 763 slot 1 ep 3 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.382592: xhci_dbc_handle_transfer: BULK: Buffer 000000017255c000 length 1024 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:C kworker/4:0-30 [004] d..1. 427.382592: xhci_dbc_giveback_request: bulk-in: req 00000000f836aa56 length 261/1024 ==> 0 ksoftirqd/4-29 [004] d.s1. 427.387396: xhci_dbc_gadget_ep_queue: BULK: Buffer 000000017255c000 length 1024 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:c ksoftirqd/4-29 [004] d.s.. 427.387397: xhci_dbc_queue_request: bulk-in: req 00000000f836aa56 length 0/1024 ==> -115 ksoftirqd/4-29 [004] d.s1. 427.387397: xhci_dbc_gadget_ep_queue: BULK: Buffer 000000017255a000 length 1024 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:c ksoftirqd/4-29 [004] d.s.. 427.387398: xhci_dbc_queue_request: bulk-in: req 000000005c9bd064 length 0/1024 ==> -115 sock->usb-4089 [004] d..1. 427.388796: xhci_dbc_gadget_ep_queue: BULK: Buffer 000000017255d400 length 24 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:c sock->usb-4089 [004] d.... 427.388801: xhci_dbc_queue_request: bulk-out: req 00000000653a50a0 length 0/24 ==> -115 sock->usb-4089 [004] d..1. 427.388857: xhci_dbc_gadget_ep_queue: BULK: Buffer 000000017255d000 length 371 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:c sock->usb-4089 [004] d.... 427.388858: xhci_dbc_queue_request: bulk-out: req 000000006de24797 length 0/371 ==> -115 kworker/4:0-30 [004] d..1. 427.388869: xhci_dbc_handle_event: EVENT: TRB 000000019dbcaaa0 status 'Success' len 0 slot 1 ep 2 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.388870: xhci_dbc_handle_transfer: BULK: Buffer 000000017255d400 length 24 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:C kworker/4:0-30 [004] d..1. 427.388870: xhci_dbc_giveback_request: bulk-out: req 00000000653a50a0 length 24/24 ==> 0 kworker/4:0-30 [004] d..1. 427.394600: xhci_dbc_handle_event: EVENT: TRB 000000019dbcaab0 status 'Success' len 0 slot 1 ep 2 type 'Transfer Event' flags e:c kworker/4:0-30 [004] d..1. 427.394601: xhci_dbc_handle_transfer: BULK: Buffer 000000017255d000 length 371 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:i:e:C kworker/4:0-30 [004] d..1. 427.394603: xhci_dbc_giveback_request: bulk-out: req 000000006de24797 length 371/371 ==> 0 Short packet was processed instead of being dropped as it happened previously without your fix. Thanks, Lukasz > Thanks > Mathias