Hi
On 21.04.21 09:28, Mathias Nyman wrote:
On 19.4.2021 22.05, Ole Salscheider wrote:
Hi Mathias,
...
tracing shows the content of the TRB while it's written, before writing the cycle bit.
Could be worth dumping transfer ring after issue is seen and check TRB still looks valid:
cat /sys/kernel/debug/usb/xhci/<your controller>/devices/<xx>/ep<yy>/trbs
you need to figure out which controller, device and endpoint number for this.
I think this is the output for the correct device and endpoint:
https://stuff.salscheider.org/trbs
Thanks, the TRB was already turned No-op by the cancel code by then, but it doesn't matter,
your trace below indicates the TRB is fine.
How about your delayed link TRB cycle write patch? did it help trigger an event for the last transfer
TRB, or did it just help as controller couldn't go past the link TRB and was less out of sync with
the driver?
You can find a trace of recording a few seconds of video with my patch here:
https://stuff.salscheider.org/trace_with_patch
As far as I can tell from it, the patch helps to trigger an event for all transfer TRBs. I also tried to record around 30 min of video with it and it did not encounter a problem while recording.
I however cannot start a new recording after stopping the first. And sometimes I observe hangs of the computer some time after stopping the recording. That might be some unwanted side-effect of my patch...
Yes, It shows transfer events for the TRBs before the link TRB.
This workaround is however a can of worms as many TD's consist of several TRBs, and they only trigger events for the last TRB.
How about a different approach?
If the issue is only with transfers starting on the last TRB before the link TRB, we could turn that TRB to a no-op.
Does something like the code below help?
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 6cdea0d00d19..0ffda8127640 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3181,6 +3181,12 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
}
}
+ if (ep_ring != xhci->cmd_ring &&
+ !trb_is_link(ep_ring->enqueue) &&
+ trb_is_link(ep_ring->enqueue + 1))
+ queue_trb(xhci, ep_ring, 0, 0, 0, 0,
+ TRB_TYPE(TRB_TR_NOOP) | ep_ring->cycle_state);
+
while (trb_is_link(ep_ring->enqueue)) {
/* If we're not dealing with 0.95 hardware or isoc rings
* on AMD 0.96 host, clear the chain bit.
Your patch seems to work. I can record video with this and it seems
stable so far.
But there is still something off (as with my patch): If I stop the video
recording and try to record again, the camera does not give me any
frames. Maybe this is an unrelated issue but it works fine on the two
other host controllers that I tested.
If you are interested you can find a trace here:
https://stuff.salscheider.org/dmesg_second
https://stuff.salscheider.org/trace_second
In this trace I recorded a few seconds of video with ffmpeg, killed it
(at second 108) and restarted it (at second 116). Can you see anything
suspicious in the trace?
Thanks a lot
- Ole