Re: Misbehaving Alder Lake-N PCH USB 3.2 xHCI Host Controller

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25.8.2024 0.22, FPS wrote:


On 8/23/24 1:43 PM, FPS wrote:
 > If possible I'd like to see xhci traces of this issue.

Here is a curl'able trace (from running timeout -s KILL 0.1 jackd -R -P
70 -d alsa -d hw:USB -p 48 -n 2 (thanks to Michal for the suggestion)):

https://uni-bielefeld.sciebo.de/s/1ixBodEpmSg8M2k/download

Or should I include it inline (it's ca. 1M.)

OK, that trace is possibly unnecessarily complex. Maybe it makes sense
to look at a more isolated issue first. I mentioned before that just
using the capture direction with jackd seems to be fine, but the
playback direction is broken. It is broken in a somewhat "weird" way.
Jack doesn't report XRuns, but playback is definitely broken (Sound
plays back as if there is a zero buffer inserted after every buffer and
with ca. half the speed.

It's a bit complex, but it shows the missed service events started from
out transfers.

Quick background, audio class driver queues URBs. One URB containis one or several
TDs (transfer descriptors). One TD is consumed each interval, in this case each 125us.
In this case audio driver uses two URBs, usually containing 7 or 8 TDs, each transferring
48 bytes. TD (Transfer descriptor) equals one TRB (Transfer Block) in this particular case.


USB audio driver queues two URBs:
  185.943950: xhci_urb_enqueue: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 0/384  8 x 48 bytes 0x10a3f5000 - 0x10a3f5070
  185.943966: xhci_urb_enqueue: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 0/336  7 x 48 bytes 0x10a3f5080 - 0x10a3f50e0

xHCI gives back URBs once completed. About 1ms apart which is ok.
Odd that audio driver doesn't queue back the fist URB immediately after receiving it, this could cause missed service event.
  185.946834: xhci_urb_giveback: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 384/384
  185.947705: xhci_urb_giveback: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 336/336

Audio driver queues both URBs after it has processed their content.
very odd that the other URB only contains 1 TD (covering only 125us)
  185.947732: xhci_urb_enqueue: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 0/48  1 x 48 bytes (0x10a3f50f0)
  185.947745: xhci_urb_enqueue: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 0/336 7 x 48 bytes (0x10a3f5100 - 10a3f5160)

Audio driver cancels both URBs, Odd, why does it do that?
  185.949843: xhci_urb_dequeue: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 0/48    CANCEL why?? (0x10a3f50f0)
  185.949848: xhci_urb_dequeue: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 0/336   CANCEL why?? (0x10a3f5100 - 10a3f5160) turned to no-ops, deq at af5100

xhci gives back both cancelled URBs (empty)
  185.949902: xhci_urb_giveback: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 0/336
  185.949908: xhci_urb_giveback: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 0/48

Audio driver queues both URBs, almost at the same time
  185.950017: xhci_urb_enqueue: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 0/384  8 x 48 bytes (10a3f5170 - 10a3f51e0)
  185.950031: xhci_urb_enqueue: ep1out-isoc: urb 0000000094a36c26 pipe 33792 slot 3 length 0/336  7 x 48 bytes (10a3f51f0 - 10a3f5250)

xhci driver successfully completes the first URB.
  185.952765: xhci_urb_giveback: ep1out-isoc: urb 000000007faac5c4 pipe 33792 slot 3 length 384/384

xhci reports missed Service error for the first TD (TRB) of the second URB (at buffer location 000000010a3f51f0)
  185.952876: xhci_handle_event: EVENT: TRB 000000010a3f51f0 status 'Missed Service Error' len 48 slot 3 ep 2 type 'Transfer Event' flags e:C

This is again odd.
The two last URBs were queued at almost the same time, and the TDs of the first URBs
completed successfully. I would understand a 'Missed service event' at an earlier stage,
for example at  0x10a3f50f0.

Maybe the missed service event TRB pointer isn't that exact.
Also, audio driver cancelling URBs right after queuing them does not help.


I ran jackd like this:

timeout -s KILL 2 jackd -R -P 70 -d alsa -d hw:USB -p 48 -n 2 -P 2>&1 |
logger

This gave me (somewhat imprecise) timestamps for the jack output:

[119172.966606] arch97 fps[177753]: jackd 0.126.0
[119172.966616] arch97 fps[177753]: Copyright 2001-2009 Paul Davis,
Stephane Letz, Jack O'Quinn, Torben Hohn and others.
[119172.966619] arch97 fps[177753]: jackd comes with ABSOLUTELY NO WARRANTY
[119172.966621] arch97 fps[177753]: This is free software, and you are
welcome to redistribute it
[119172.966622] arch97 fps[177753]: under certain conditions; see the
file COPYING for details
[119172.966624] arch97 fps[177753]:
[119172.973620] arch97 fps[177753]: no message buffer overruns
[119172.975503] arch97 fps[177753]: JACK compiled with System V SHM support.
[119172.989889] arch97 fps[177753]: loading driver ..
[119172.990672] arch97 fps[177753]: creating alsa driver ...
hw:USB|-|48|2|48000|0|0|nomon|swmeter|-|32bit
[119172.991245] arch97 fps[177753]: configuring for 48000Hz, period = 48
frames (1.0 ms), buffer = 2 periods
[119172.991249] arch97 fps[177753]: ALSA: final selected sample format
for playback: 32bit integer little-endian
[119172.991252] arch97 fps[177753]: ALSA: use 2 periods for playback
[119174.533438] arch97 fps[177753]: delay of 1003.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.533463] arch97 fps[177753]:
[119174.536202] arch97 fps[177753]: delay of 1051.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.536222] arch97 fps[177753]:
[119174.539134] arch97 fps[177753]: delay of 1060.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.539152] arch97 fps[177753]:
[119174.542173] arch97 fps[177753]: delay of 1049.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.542192] arch97 fps[177753]:
[119174.545145] arch97 fps[177753]: delay of 1047.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.545164] arch97 fps[177753]:
[119174.548423] arch97 fps[177753]: delay of 993.000 usecs exceeds
estimated spare time of 966.000; restart ...
[119174.548447] arch97 fps[177753]:

which continues for a while until jackd is killed.. The timestamps allow
us to correlate (again, imprecisely) what's in the trace (curl'able link):

https://uni-bielefeld.sciebo.de/s/5gXyqY0cqNrQZdE/download

It is quite peculiar. Jack seems to start up fine and is done
configuring it's "driver" at 119172.991252 and seems to run fine until
119174.533438 which is ca. 1.5 secs later where weird things start to
happen. A delay of 1.003 ms is reported which is just above the 1 ms
period duration of 48 samples @ 48000 Hz sampling rate. So the last
buffer would have been sent off to the soundcard about 1 ms before this
time. Now looking at the trace I noticed a little thing. In the part pf
the trace before the excessive delays start I see things like this:

  irq/125-xhci_hc-169     [003] ..s.3 119173.246016:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep1out
  irq/125-xhci_hc-169     [003] b...3 119173.247898: xhci_handle_event:
EVENT: TRB 000000010a3f14d0 status 'Success' len 0 slot 3 ep 2 type
'Transfer Event' flags e:c

or

  irq/125-xhci_hc-169     [003] ..s.3 119173.248024:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep1out
  irq/125-xhci_hc-169     [003] b...3 119173.248883: xhci_handle_event:
EVENT: TRB 000000010a3f1560 status 'Success' len 0 slot 3 ep 2 type
'Transfer Event' flags e:c

The time between ringing the doorbell and the next xhci_handle_event is
mostly below 2 ms. But lateron I see things like this:

  irq/125-xhci_hc-169     [003] ..s.3 119174.323020:
xhci_ring_ep_doorbell: Ring doorbell for Slot 3 ep1out
  irq/125-xhci_hc-169     [003] b...3 119174.324897: xhci_handle_event:
EVENT: TRB 000000010a3f1d30 status 'Success' len 0 slot 3 ep 2 type
'Transfer Event' flags e:C

Suddenly this time has jumped to almost 3 ms. I'm still getting my head
around the low level USB stuff but possibly this is a hint?

It's interesting to see that there are no 'Missed service events' or
buffer under/overruns even if host clearly doesn't handle one TD per 125us.

10 returned URBs from last log show additional delays.
grep "xhci_urb_giveback: ep1out-isoc" trace_playback.txt:
...
119174.201925: xhci_urb_giveback: ep1out-isoc: urb 000000005f2dc90a pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.202916: xhci_urb_giveback: ep1out-isoc: urb 0000000032ac090e pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.204924: xhci_urb_giveback: ep1out-isoc: urb 000000005f2dc90a pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.205913: xhci_urb_giveback: ep1out-isoc: urb 0000000032ac090e pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.207926: xhci_urb_giveback: ep1out-isoc: urb 000000005f2dc90a pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.208912: xhci_urb_giveback: ep1out-isoc: urb 0000000032ac090e pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.210923: xhci_urb_giveback: ep1out-isoc: urb 000000005f2dc90a pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.211916: xhci_urb_giveback: ep1out-isoc: urb 0000000032ac090e pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.213925: xhci_urb_giveback: ep1out-isoc: urb 000000005f2dc90a pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004
119174.214912: xhci_urb_giveback: ep1out-isoc: urb 0000000032ac090e pipe 33792 slot 3 length 384/384 sgs 0/0 stream 0 flags 00000004

URBs contain 8TDs with 125us interval, and should complete 1ms apart on average.
This won't be true for every URB as sometimes urb giveback might be blocked for
a short while, but it should average out. Next URB would then return a bit earlier.

In this case it doesn't. URB are permanently delayed.

We use "SIA" (Start Isoc ASAP) flag for Isoc transfers, this allows controller
to queue TD to be transferred "as soon as possible" without forcing it to a
specific microframe.  Could be that this flag in combination with queueing TDs
very close to the IST limit causes the delay but not triggering missed service
events, or under/overruns.

Adding more URBs could help, but causes a bit more latency.

Or maybe see if we can get audio driver to use 3 URBs with 6TDs each?
Thanks
Mathias





[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux