On 09.05.21 02:01, Forest Crossman wrote:
On Thu, May 6, 2021 at 4:06 AM Mathias Nyman
<mathias.nyman@xxxxxxxxxxxxxxx> wrote:
On 5.5.2021 10.56, Ole Salscheider wrote:
Hi Mathias,
...
How about a different approach?
If the issue is only with transfers starting on the last TRB before the link TRB, we could turn that TRB into a no-op.
Does something like the code below help?
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 6cdea0d00d19..0ffda8127640 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3181,6 +3181,12 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 		}
 	}
 
+	if (ep_ring != xhci->cmd_ring &&
+	    !trb_is_link(ep_ring->enqueue) &&
+	    trb_is_link(ep_ring->enqueue + 1))
+		queue_trb(xhci, ep_ring, 0, 0, 0, 0,
+			  TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state);
+
 	while (trb_is_link(ep_ring->enqueue)) {
 		/* If we're not dealing with 0.95 hardware or isoc rings
 		 * on AMD 0.96 host, clear the chain bit.
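The idea in the patch above can be modeled outside the kernel. The following is a minimal user-space sketch, not the driver code: `struct ring`, `RING_SIZE`, `put_trb()` and `queue_transfer()` are hypothetical names invented for illustration, every transfer is assumed to be a single TRB, and dequeueing is not modeled at all. It only shows how padding the last slot before the link TRB with a no-op forces the next transfer to start at the beginning of the segment.

```c
#include <assert.h>

#define RING_SIZE 8            /* hypothetical: 7 usable slots + 1 link TRB */

enum trb_type { TRB_EMPTY, TRB_NORMAL, TRB_NOOP, TRB_LINK };

struct ring {
    enum trb_type trbs[RING_SIZE];
    int enqueue;               /* index of the next slot to fill */
    int cycle_state;           /* toggled each time the link TRB is followed */
};

static void ring_init(struct ring *r)
{
    for (int i = 0; i < RING_SIZE; i++)
        r->trbs[i] = TRB_EMPTY;
    r->trbs[RING_SIZE - 1] = TRB_LINK;   /* last slot links back to slot 0 */
    r->enqueue = 0;
    r->cycle_state = 1;
}

static int is_link(const struct ring *r, int idx)
{
    return r->trbs[idx] == TRB_LINK;
}

static void put_trb(struct ring *r, enum trb_type type)
{
    /* The link TRB itself must never be overwritten. */
    assert(r->enqueue < RING_SIZE - 1);
    r->trbs[r->enqueue] = type;
    r->enqueue++;
}

/* Queue a one-TRB transfer; returns the index where it was placed. */
static int queue_transfer(struct ring *r)
{
    /* Workaround: if the next free slot is the last one before the
     * link TRB, fill it with a no-op instead of starting a transfer. */
    if (!is_link(r, r->enqueue) && is_link(r, r->enqueue + 1))
        put_trb(r, TRB_NOOP);

    /* Follow the link TRB back to the start of the segment. */
    while (is_link(r, r->enqueue)) {
        r->enqueue = 0;
        r->cycle_state ^= 1;
    }

    int start = r->enqueue;
    put_trb(r, TRB_NORMAL);
    return start;
}
```

With `RING_SIZE` set to 8, the seventh transfer would otherwise start at index 6. In this model it leaves a no-op there, the cycle state toggles, and the transfer starts at index 0.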
Your patch seems to work. I can record video with this and it seems stable so far.
But there is still something off (as with my patch): If I stop the video recording and try to record again, the camera does not give me any frames. Maybe this is an unrelated issue but it works fine on the two other host controllers that I tested.
If you are interested you can find a trace here:
https://stuff.salscheider.org/dmesg_second
https://stuff.salscheider.org/trace_second
In this trace I recorded a few seconds of video with ffmpeg, killed it (at second 108) and restarted it (at second 116). Can you see anything suspicious in the trace?
I guess this second issue is unrelated. The cameras have been stable so far with your patch. It might be good to include this workaround in mainline. Will you take care of it or should I send something to the list?
This is still not a very nice solution. We have no clue about the actual root cause.
I remember now there was a similar issue with an earlier ASMedia host some years ago.
This was fixed by modifying some internal flow control parameters of the host in:
9da5a1092b13 xhci: Bad Ethernet performance plugged in ASM1042A host
Not sure if Jiahau Chang (cc) works on this anymore, but maybe he knows who to contact.
Also adding Forest Crossman, who has committed ASMedia fixes lately.
Any clue about the root cause?
thread:
https://lore.kernel.org/linux-usb/20210416093729.41865-1-ole@xxxxxxxxxxxxxxx
Unfortunately, I don't know what could be causing this. The only thing
I would suggest is to see if this problem happens (without the patch)
while the USB device is connected directly to a port on the ASMedia
host controller, with no other hubs or devices connected to that
controller. The only problem I've been seeing with my various ASMedia
cards is when I try to do a lot of bulk reads from multiple devices
simultaneously (e.g., when dd-ing from multiple hard drives to
/dev/null). In those cases, the controller eventually triggers an
IOMMU page access violation, which causes the kernel to reset the PCIe
endpoint. So if the camera works fine when it's the only device
connected to the host controller (without any patches applied), that
might indicate that this is the same issue. But that's mostly a wild
guess--I don't know enough of the USB or xHCI standards to really
understand what's going on.
The problem occurs here also if only one camera and no other device are
connected to the ASMedia host controller.
Best of luck resolving this issue,
Forest