[PATCH 00/13] loopback: Make module-loopback honor requested latency (v5)

georg@xxxxxxxx (Georg Chini) · Wed, 18 Nov 2015 20:43:33 +0100

On 15.11.2015 22:08, Alexander E. Patrakov wrote:
> 25.02.2015 23:43, Georg Chini wrote:
>> Hello,
>>
>> this is the split version of the module-loopback patch. The patch 
>> optimizes the latency
>> initialization and regulation of the module.
>> Many thanks to Alexander Patrakov for splitting and reviewing the 
>> patch and also for
>> his contribution to the code.
>>
>> Georg Chini (13):
>>    loopback: Fix the obviously-wrong "buffer+=buffer" logic
>>    loopback: Save the timestamps when we snapshot latency
>>    loopback: Improved estimation of latency
>>    loopback: Adjust rates based on latency difference
>>    loopback: Don't track max_request and min_memblockq_length
>>    loopback: Restart the timer right away
>>    loopback: Refactor latency initialization
>>    loopback: Track underruns and cant-peek events
>>    loopback: Track the amount of jitter
>>    loopback: Added a deadband to reduce rate hunting
>>    loopback: Don't change rate abruptly
>>    loopback: Validate the rate parameter
>>    loopback: add parameter buffer_latency_msec
>>
>>   src/modules/module-loopback.c | 578 
>> ++++++++++++++++++++++++++++++++----------
>>   src/pulse/sample.c            |   5 +-
>>   2 files changed, 443 insertions(+), 140 deletions(-)
>>
>
> I have tested this code again. Here is my opinion.

Thank you a lot!

>
>
> It would be logical enough to consider the state of the original 
> loopback module, and compare the state of affairs after 04/13 and 
> after 13/13 to it. And compare to other implementations of adaptive 
> resamplers as well.
>
> To test the thing, I used my laptop (that has an internal sound card) 
> and my desktop (that has an internal PCH soundcard, also I connected 
> my Rotel RA-1570 amplifier via USB).
>
> On the laptop, I created a 5-minute wav file containing a sine wave 
> with 1 kHz frequency. I connected the laptop headphone output to the 
> desktop PCH line input, and used the loopback module to transfer audio 
> from PCH to the Rotel amplifier. In all cases, the nominal sample rate 
> 44100 Hz was used. In all cases, the sound sent to the Rotel amplifier 
> was recorded into a wav file, and the resulting file was analysed with 
> a Python script.
>
> I have tried two techniques for instrumenting the loopback module.
>
> First, I thought that I can record the sound sent to the Rotel sound 
> card by recording from PulseAudio monitor source. However, this turned 
> out to be a bad idea: the existing loopback module behaved differently 
> (according to its log) when its sink was monitored. Namely, it 
> immediately converged to 44100 Hz - something that it could do only 
> after two minutes without such monitoring.
>
> So, I created this snippet in .asoundrc:
>
> pcm.savefile {
>     type file
>     format "wav"
>     slave.pcm "hw:PCUSB"
>     file "/tmp/save.wav"
> }
>
> and loaded module-alsa-sink so that PulseAudio saved its output to a 
> file.
>
> The most important result is that the old module, both when monitored 
> and when not monitored, does not respect the requested latency in this 
> test case. It settles down to something between 300 and 400 ms, even 
> though 200 ms was requested. There is no such bug with the new 
> implementation.
>
> Now let's talk about the script. It is attached. A mono wav file 
> containing some silence, recording of a resampled 1 kHz sine wave, and 
> some silence again, should be passed as an argument. The script will 
> try to estimate the phase of the sine wave (y = A * sin(omega * t + 
> phi), the script tries to estimate phi) at various points in time, and 
> plot the result. The plot will generally contain some linear trend 
> (due to unsynchronized clocks) and some wobble around that. Then it 
> will plot again, with this linear trend eliminated (i.e. will estimate 
> the phase again using the correct signal frequency in terms of the 
> sink clock). Both results are saved as png files, twice (first the 
> full version, then only the second half, to hide the startup process).
>
> I chose this metric because it was used by Fons Adriaensen when 
> discussing the merits of zita-ajbridge:
>
> http://kokkinizita.linuxaudio.org/linuxaudio/zita-ajbridge-doc/quickguide.html 
>
>
> Both my plots and the plots on his page show the unwrapped phase in 
> degrees. Phase, in our case, is related to the delay. 360 degrees = 1 
> millisecond = 34 cm of air. In other words, 1 degree = 2.77 us = 0.94 
> mm of air.
>
>
> The first result (https://imgur.com/a/twBEk ) is actually without any 
> loopback module. Just a recording from the line input. We can clearly 
> see that there is some clock skew between the laptop and desktop 
> soundcards, and that the clock ratio is even not constant. OK - this 
> is a reference how good any loopback module can be.
>
> The second result (https://imgur.com/a/eVahQ ) is with the old module. 
> You see that, with it, the rate oscillates wildly and then snaps to 
> 44100 Hz. However, the final latency is not correct, and it 
> accumulates due to clock mismatch between the two sound cards in the 
> desktop. I have not waited enough to see what happens next, but the 
> result is clearly invalid.

I wonder why the phase is not continuously growing in the final plot. I 
would expect this
because the controller snaps to the base rate.

>
> The next result (https://imgur.com/a/Bi2Yz ) is with the new module, 
> as of PATCH 04/13. There is some "rate hunting", and it results to 
> some noise in phase. It is generally contained between -150 and 200 
> degrees.
>
> Next, let's see (https://imgur.com/a/yQGWL ) what the deadband 
> achieves. Result: it helps against rate hunting, but not much against 
> phase oscillations. This is because new_rate is snapped to base_rate, 
> which would only be correct if the cards actually shared the common 
> clock. So here is what happens: the module runs for some time with 1:1 
> resampling (i.e. with the rate strictly equal to 44100 Hz), and the 
> latency difference slowly drifts due to the clock mismatch. When the 
> difference becomes bigger than the error of its measurement, the 
> module corrects it, and then runs at 44100 Hz for some time again.
>

Could you use 2 s as adjust time? 10 seconds is definitely too slow to 
achieve good results.
I would expect a similar result but with much smaller amplitude of the 
phase oscillation.

>
> However, I'd argue that this phase metric can be improved without the 
> deadband and beyond what this deadband-based implementation provides.
>

What is the problem with the dead band? There are physical limitations 
to what makes
sense to correct and the dead band just takes care of those limits.

>
> First, see (https://imgur.com/a/P5Y0A ) what happens, at PATCH 04/13, 
> if we just decrease adjust_time to 1 second. The rate oscillations 
> become bigger, but phase oscillations stay of the same order as with 
> the default value of adjust_time (10 seconds). So not good.
>
> Then, let's change the rate controller so that it does not attempt to 
> correct all of the latency difference in one step. The value of 
> adjust_time is still 1s, but let's aim to correct the difference after 
> 10 steps by a factor of 2.71. For simplicity (and for testing purposes 
> only), let's destroy the non-linear part of the controller. Here is 
> the code:
>
> static uint32_t rate_controller(
>                 uint32_t base_rate,
>                 pa_usec_t adjust_time,
>                 int32_t latency_difference_usec) {
>
>     uint32_t new_rate;
>
>     new_rate = base_rate * (1.0 + 0.1 * 
> (double)latency_difference_usec / adjust_time);
>
>     return new_rate;
> }
>
> As you can see (https://imgur.com/a/eZT8L ), the phase oscillations 
> are now much more limited!
>
> I have received a private email from Georg where he proposes to use 
> the deadband, but snap to something other than base_rate, e.g. to a 
> result of some slow controller. For me, this is equivalent to using 
> this slow controller, i.e. a good idea.
>

I'll send you a patch privately later today. I kept the controller and 
implemented a more or less
independent controller which takes care of the clock skew / latency 
drift. I'd like to discuss
this approach first with you before I send another patch to the list.

>
> Finally, let's compare Georg's code to known competitors.
>
> alsaloop (part of alsa-utils): https://imgur.com/lvZLoli . Could not 
> settle after 300 seconds, so I am not uploading any processed plots.
>
> alsa_in (part of jack1): https://imgur.com/a/9PnW3 . Rather big 
> oscillations, definitely worse than what's proposed here.
>
> zita-ajbridge: could not run it, because it deadlocks at the start. 
> Could be the same as 
> https://bugs.launchpad.net/ubuntu/+source/zita-ajbridge/+bug/1368716
>
>
> Also, let's consider the limitations of what we can achieve with 
> different controllers.
>
> First, power-saving proponents will not let the latency be sampled 
> more often than once in 100 ms by default. Thus, we can only adjust 
> rates once in 100 ms. It is a PulseAudio limitation that rate is an 
> integer. If the correct rate is 44100.5 Hz, then a controller can only 
> set it to either 44100 or 44101 Hz. In any case, that's 1/88200 
> seconds per second of clock skew. Per our 100 ms period of sampling 
> the latency difference, that's 1.13 us. With a 1 kHz test signal (as 
> used in all examples above), that's about 0.4 degrees of phase. So, 
> "rate is an integer" is not an important limitation - the 
> theoretically achievable resolution is only slightly worse than the 
> natural clock instability of consumer soundcards.
>
>
> TL;DR: I'm in favor of merging up to PATCH 05/13, and, if someone else 
> reviews patches 06 and 07, would recommend merging up to PATCH 08/13. 
> I would recommend waiting two weeks before merging further patches, as 
> the deadband stuff is likely to get undone and redone if the rate 
> controller gets replaced. And I am going to experiment further with 
> different rate controllers on the next weekends.
>