On Thu, May 6, 2021 at 4:06 AM Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>
> On 5.5.2021 10.56, Ole Salscheider wrote:
> > Hi Mathias,
> >
> > ...
> >
> >>> How about a different approach?
> >>> If the issue is only with transfers starting on the last TRB before the link TRB, we could turn that TRB to a no-op.
> >>> Does something like the code below help?
> >>>
> >>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> >>> index 6cdea0d00d19..0ffda8127640 100644
> >>> --- a/drivers/usb/host/xhci-ring.c
> >>> +++ b/drivers/usb/host/xhci-ring.c
> >>> @@ -3181,6 +3181,12 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
> >>>                 }
> >>>         }
> >>> +       if (ep_ring != xhci->cmd_ring &&
> >>> +           !trb_is_link(ep_ring->enqueue) &&
> >>> +           trb_is_link(ep_ring->enqueue + 1))
> >>> +               queue_trb(xhci, ep_ring, 0, 0, 0, 0,
> >>> +                         TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state);
> >>> +
> >>>         while (trb_is_link(ep_ring->enqueue)) {
> >>>                 /* If we're not dealing with 0.95 hardware or isoc rings
> >>>                  * on AMD 0.96 host, clear the chain bit.
> >>
> >> Your patch seems to work. I can record video with this and it seems stable so far.
> >>
> >> But there is still something off (as with my patch): If I stop the video recording and try to record again, the camera does not give me any frames. Maybe this is an unrelated issue but it works fine on the two other host controllers that I tested.
> >>
> >> If you are interested you can find a trace here:
> >> https://stuff.salscheider.org/dmesg_second
> >> https://stuff.salscheider.org/trace_second
> >>
> >> In this trace I recorded a few seconds of video with ffmpeg, killed it (at second 108) and restarted it (at second 116). Can you see anything suspicious in the trace?
> >
> > I guess this second issue is unrelated. The cameras have worked stable so far with your patch. It might be good to include this workaround in mainline. Will you take care of it or should I send something to the list?
> >
>
> This is still not a very nice solution. We have no clue about the actual root cause.
>
> I remember now there was a similar issue with an earlier ASMedia host some years ago.
> This was fixed by modifying some internal flow control parameters of the host in:
>
> 9da5a1092b13 xhci: Bad Ethernet performance plugged in ASM1042A host
>
> Not sure if Jiahau Chang (cc) works on this anymore, but maybe he knows who to contact.
> Also adding Forest Crossman who has committed ASMedia fixes lately
>
> Any clue about the root cause?
> thread:
> https://lore.kernel.org/linux-usb/20210416093729.41865-1-ole@xxxxxxxxxxxxxxx

Unfortunately, I don't know what could be causing this. The only thing I would suggest is to see if this problem happens (without the patch) while the USB device is connected directly to a port on the ASMedia host controller, with no other hubs or devices connected to that controller.

The only problem I've been seeing with my various ASMedia cards is when I try to do a lot of bulk reads from multiple devices simultaneously (e.g., when dd-ing from multiple hard drives to /dev/null). In those cases, the controller eventually triggers an IOMMU page access violation, which causes the kernel to reset the PCIe endpoint. So if the camera works fine when it's the only device connected to the host controller (without any patches applied), that might indicate that this is the same issue. But that's mostly a wild guess--I don't know enough about the USB or xHCI standards to really understand what's going on.

Best of luck resolving this issue,
Forest