alsa sink latency - how to account for startup delay

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 30.03.2016 18:06, Tanu Kaskinen wrote:
> On Tue, 2016-03-29 at 20:29 +0200, Georg Chini wrote:
>> On 29.03.2016 05:56, Georg Chini wrote:
>>> On 29.03.2016 02:13, Pierre-Louis Bossart wrote:
>>>> On 3/28/16 12:38 PM, Georg Chini wrote:
>>>>> On 28.03.2016 17:18, Georg Chini wrote:
>>>>>> On 28.03.2016 16:16, Pierre-Louis Bossart wrote:
>>>>>>> On 3/22/16 4:11 AM, Georg Chini wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>> Ups, looks like I misread your sentence above. You are right, in
>>>>> software you can't
>>>>> see the output. But what you can see is the time between dispatching
>>>>> the
>>>>> audio
>>>>> to the USB bus and the time when the bus reports back that the audio
>>>>> was
>>>>> played.
>>>>>
>>>>> I am not sure if I read the code right, I don't know anything about
>>>>> USB,
>>>>> but if the
>>>>> URB's you are submitting to the bus in the prepare step are handled in
>>>>> chronological
>>>>> order, it means that
>>>>> a) you have to wait for the next URB to retire before you can send any
>>>>> real audio
>>>>> b) after you submitted the first audio, it is at the end of the queue
>>>>> and all the other
>>>>> URB's containing silence will be processed first.
>>>> The USB driver will submit N silence URBs on startup in the prepare
>>>> and you will have to wait for those URBs to retire before the samples
>>>> are queued. There is very little 'USB processing'. If you want to
>>>> reduce this delay you have to use smaller periods, it'll decrease the
>>>> size of the URBs. I guess it could be possible to change the URB size
>>>> after the start but that's not implemented atm.
>>>>
>>> I don't want to shorten the latency. I only want the latency reported
>>> correctly. To me it still
>>> looks like the real latency of the driver is not what it reports,
>>> because the time that the
>>> audio spends in the URB's is not taken into account. What I am seeing
>>> is, that the real
>>> latency is around 10ms longer than expected.
>> OK, I thought about it another time and I believe I figured out what the
>> problem is.
>>
>> (Sometimes I feel like I am talking Chinese to you - I seem not to be
>> able to express
>> myself in a way that others can understand, sorry for that. I'll try my
>> best now.
>> If we would sit in one room it would probably be sorted out in half an
>> hour.)
>>
>> The starting point is the observation, that with module-loopback and an
>> USB sink
>> the real end-to-end latency is about 10ms longer than configured. This
>> cannot be
>> observed with a HDA sink. Let's go through the sequence of events when
>> module-loopback is started:
> I'm interested in what delay values alsa reports, so I'll write what I
> expect the delay to be after each step. One unclear thing is that can
> the driver report the delay at higher granularity than one URB (= 5 ms
> in this example)? I'll assume here that it can report the latency with
> one sample accuracy.

I had one big mistake in my thinking - it does not matter at all
when the source starts delivering samples to module-loopback.
The samples that the source provides are buffered in the memblockq
and in the end module-loopback adjusts the memblockq so that
sink latency + source latency + buffer = configured loopback latency.
That means that even in your previous example where you start with
8 seconds initial latency, the controller in module-loopback would
correct it after some time.
The latency that is made up by the contents of the memblockq
(and the render memblockq's on the sink input/source output side)
can be exactly calculated, so if the resulting overall latency is wrong,
the error must be on the source or sink side. In this case it must be
the sink side because changing the source has no impact.
Nevertheless the sequence below is interesting, because it shows,
how the latency is calculated on the alsa side. I'll attach a log with
some debugging information, numbers without units are usec.

>
> I'll also assume that audio is moved from the ring buffer to the URBs
> atomically, so the ring buffer fill level jumps by 5 ms every time a
> URB is filled.
>
>> 1) The module sets up a buffer and fills it with silence up to the
>> configured loopback latency.
>> Sink and source are configured to 1/3 of the loopback latency. In this
>> example, we will use
>> 30ms as loopback latency, so that sink and source are configured to 10ms.
> I suppose prepare() has not been called in the sink yet, so currently
> the delay is undefined.
>
>> 2) Source and sink are started. For simplicity we assume that the source
>> starts delivering
>> samples immediately.
> I guess the sink still hasn't called prepare()?
Somewhere here the sink calls prepare.
>
>> 3) As long as sink_input_pop() has not been called, module-loopback
>> drops the audio
>> coming in from the source.
>>
>> 4) On the alsa side, the preparation has been done. Again, to simplify
>> we assume 2 URB's
>> with 5ms each have been set up and submitted to the USB bus. The first
>> URB starts
>> playing silence to the speakers now.
> prepare() has been called. If we assume that the DAC has not yet
> consumed anything from the first URB, the "USB bus" latency is 10 ms.
> PulseAudio hasn't written anything yet to the ring buffer. Expected
> delay: 10 ms.
I think the delay is still undefined here. No audio has been sent
and the stream is not even started.

>
>> 5) The pulseaudio sink code takes the first 10ms of audio out of the
>> loopback buffer,
>> writes it to the alsa buffer and calls snd_pcm_start().
> If the sink takes something from the loopback buffer, this means that
> the first pop() call has been made. Assuming no time has passed since
> the previous step, the USB bus is still full, and so is the ring
> buffer. Expected delay: 20 ms.

Reported delay is exactly the amount of audio that was written to
the buffer.

>
>> 6) Now sink_input_pop() is called, and that is where things go wrong,
>> because the
>> audio from the source is buffered from now on.
> Is this already the second pop() call? That means that at least one URB
> has been filled from the ring buffer, otherwise pop() wouldn't have
> been called due to the ring buffer being full.

No, the order was wrong, I meant the first pop() callback. But
this does not matter anyway - see above.

>
> During the pop() call the ring buffer contains 0 ms of audio, if two
> URBs have been filled from the ring buffer, or 5 ms, if only one has
> been filled.
>
> Expected delay during the pop() call: 0 or 5 ms + whatever the DAC
> hasn't yet consumed from the USB bus. Expected delay after the sink has
> finished rendering and writing to the ring buffer: 10 ms + whatever the
> DAC hasn't yet consumed from the USB bus.

Since I got it wrong and we are still at the first pop() call, the
delay is still 10ms.

>
>> 7) On the alsa side nothing has happened yet. The reported delay is what
>> is in the alsa
>> buffer - 10ms. Alsa has to wait for the first URB to expire before any
>> sample can be
>> submitted to the bus.
> How can nothing have happened yet on the alsa side? pop() has been
> called twice already, which means that the alsa driver has already
> consumed audio from the ring buffer. I'd count that as something
> happening.
See above - my mistake. But it looks like pop() is really called
a second time before the delay starts going down - you can see
in the attached log, that first there are two writes to the alsa
buffer - one to fill up to the configured sink latency and then
another one that shouldn't be there. I could not figure out
why this happens. The code looks fine and you already checked
yourself that this shouldn't happen. The second write to the
alsa buffer increases the delay to 19ms, which is a lot more
than the configured 13 ms, but the 6 ms difference still
not account for what I measure. Without using my "startup delay"
I am about 15 ms off.

>
> If the first URB hasn't yet expired (ignoring the conflicting
> information that pop() has been called twice already), it doesn't make
> sense that alsa would report 10 ms delay. The two submitted URBs add 10
> ms to the delay, minus everything that the DAC has consumed at this
> point, which I assume to be very little. On top of that there's 10 ms
> of audio written to the ring buffer. If alsa reports the full delay to
> be 10 ms, that's wrong, don't you think? It should report 20 ms (or
> something close to that to account for whatever the DAC has consumed).
No, it doesn't. It reports exactly the delay that is made up of the
audio written to the alsa buffer and nothing more.

>
>> 8) Now the first URB has retired and can be filled with audio. After the
>> URB is submitted to
>> the bus, the alsa delay is NOT updated and still shows 10ms. The second
>> URB starts playing
>> silence for 5ms. Because the buffer seems filled to the correct level,
>> the pulseaudio sink
>> code will wait until the fill level gets lower.
> When the first URB is retired and filled with audio from the ring
> buffer, that's when the ring buffer fill level jumps from 10 ms to 5
> ms. You seem to be saying that after this has happened, the alsa sink
> doesn't fill up the ring buffer. Why is that? 5 ms is half of the full
> buffer, and that should be low enough to trigger a refill.

Maybe the ring buffer fill level changes at that point - I did not check.
That could be the reason for the refill I mentioned above and also
the reason why there is no underrun later.

>
>> 9) The second URB has retired, is filled with audio and submitted to the
>> bus. Still the
>> delay is NOT updated, but from now on, it is correct, since a sample
>> that would now be
>> submitted to the alsa buffer would take 10 ms to the speakers. The first
>> real audio starts
>> playing.
> Here you seem to be describing a situation where the ring buffer is
> empty, and indeed, in that situation the reported delay should be 10
> ms. I just don't understand why the ring buffer would be empty (that
> is, why didn't pulseaudio refill the ring buffer after the first URB
> was consumed from the ring buffer).
It looks like it did - see above. But still the delay shows the same
value.

>
>> 10) Meanwhile the source has been producing samples that have been queued in
>> the loopback buffer and that make up the extra latency I have been
>> observing. In
>> this example it would be 5ms from the second URB plus the time that was left
>> from the first URB when snd_pcm_start() was called.
>>
>> 11) The first URB-full of audio has been played and the URB retires. Now
>> the alsa
>> delay is adjusted to reflect that the first 5ms have been played and the
>> audio flow
>> is completely set up.
> If the ring buffer was empty in the previous step, and we didn't write
> anything after that, then the retirement of an URB now would cause an
> underrun, since the retired URB can't be refilled from the ring buffer.
Yes that was also bothering me a while after I wrote this mail. But
it looks like the explanation is that the alsa driver does change the
ring buffer fill level, but not the delay.

>
>> If the first sink_input_pop() would be called at this point, everything
>> would be fine.
>> So the problem seems to be that sink_input_pop() is called too early.
> I don't know, your description contained many strange parts. I don't
> know what is really happening. But if module-loopback heavily relies on
> the configured latency to be the same as the real latency, then that's
> going to cause trouble anyway, because they just aren't the same thing,
> even if the alsa driver is reporting everything perfectly.
>
Module loopback relies on the reported latencies, the configured
latencies are only used during setup. If the reported latencies
don't match the configured it does not matter.

The attached logs are with my own smoother code, but I have
changed nothing else in the alsa sink, so the timing is the same.
BTW, do you think the debug output of module-loopback is
better now?
The reports of write count / delay reported by alsa stop at the
point where there is the jump in the delay that I mentioned in
another mail. In this log it is not that obvious, but the strange
thing is, that the card time goes on linear and the system time
shows the jump. No idea what is happening there.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loopback.log
Type: text/x-log
Size: 69133 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20160330/589e6298/attachment-0001.bin>


[Index of Archives]     [Linux Audio Users]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux