Hi!

On Mon, 29 Jun 2020 20:47:24 +0300, Mathias Nyman
<mathias.nyman@xxxxxxxxxxxxxxx> wrote:

> First issue I see is that the attempt to recover from a transaction
> error with a soft retry isn't working. We expect the hardware to
> retry the transfer but nothing seems to happen. Soft retry is
> described in xhci specs 4.6.8.1 and is basically a reset endpoint
> command with TSP set, followed by ringing the endpoint doorbell.
> Traces indicate driver does this correctly but hardware isn't
> retrying. We don't get any event, no error, success or stall.
>
> This could be a hardware flaw.
> Any chance you could try this on an xHC from some other vendor?

There is no other xHC hardware available to me.

> Second issue is a driver flaw, when nothing happened for 20 seconds
> we see the URB is canceled. xhci driver needs to stop the endpoint
> to cancel the URB, but there is a hw race and endpoint ends up halted
> instead of stopped. The xhci driver can't handle a halted endpoint in
> its stop endpoint handler properly, and the URB is never actually
> removed from the ring.
>
> The reason you see the IO_PAGE_FAULT is probably because once the
> ring starts running the driver will handle the cancelled URB, and
> touch already freed memory:
> AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000d address=0xdc707028 flags=0x0020]
>
> I have a patch for this second case, I haven't upstreamed it as it
> got some conflicting feedback earlier. It won't solve the 20 second
> delay, but should solve the IO_PAGE_FAULT and the "WARN Set TR
> Deq Ptr cmd failed due to incorrect slot or ep state" message
>
> Can you try it out?

I successfully applied the patch against Linux 5.7.4, but I get this
error when compiling drivers/usb/host/xhci-ring.c:

  CC [M]  drivers/usb/host/xhci-ring.o
drivers/usb/host/xhci-ring.c: In function ‘xhci_handle_cmd_stop_ep’:
drivers/usb/host/xhci-ring.c:857:3: error: implicit declaration of function ‘xhci_reset_halted_ep’ [-Werror=implicit-function-declaration]
  857 |   xhci_reset_halted_ep(xhci, slot_id, ep_index, reset_type);
      |   ^~~~~~~~~~~~~~~~~~~~

Fabian