source-sink loopback

lennart@xxxxxxxxxxxxxx (Lennart Poettering) · Thu, 20 Aug 2009 02:06:09 +0200

On Mon, 17.08.09 11:19, pl bossart (bossart.nospam at gmail.com) wrote:

> Howdy,

Heya,

> Since this was missing in PulseAudio, I created a loopback module
> where input sound can be redirected to a sink. That removes the need
> for the silly parec|pacat workaround.

Nice!

> Basically the module creates both a sink_input and source_output. When
> there's data available from the source, a push callback is called by
> the IO thread. Likewise when the sink has eaten the data a pop callback
> is invoked. Since both callbacks are called from two different
> threads, I pushed incoming data into a memblockq and read from there
> in the pop callback. I figured this would enable zero-copy
> operations. Works fine with a USB mic and PC internal speakers, and
> vice-versa. With a low-enough latency you can use it for karaoke apps,
> but that was not the initial goal...

Yes, this is basically how this should be done.

> Since this was the first time I really looked into PulseAudio
> internals, I have a number of questions:
> 0. is there a better way to do this loopback? I did not find any
> module that relied on the push callback, and I only saw the pop
> callback in very limited number of modules.

Implementing pa_sink_input::pop() and pa_source_output::push() is the
correct way. All modules that implement sink inputs/source outputs do
it this way.

> 1. latency: I configured both sink and source to the same latency, yet
> in the push callback I see the number of samples can vary. If I
> program the source with a 10ms latency, I would expect to get a 10ms
> buffer. Is this due to some kind of scheduling thingy?

This really depends. Generally no code in PA should blindly assume it
gets what it asked for. memblock sizes can vary due to various reasons
(resampling, mixing, missed deadlines). And latencies due to that
too. Also, the configured latency always depends on what the clients
which are connected requested, and when that changes this also has an
effect on the blocks the clients get, because every client gets the
same. And then, if the latency is changed from a high value to a low
value, for recording this often means that we need to process the
previously much larger queued data first. Finally, ALSA offers an API
to choose buffer sizes, but no API to choose the overall latency. That
means for example for outputs like USB that if we ask
for a specific latency by configuring the buffer size accordingly the
resulting measured latency will be higher since we didn't know in
advance what kind of extra latency the hardware adds after the
playback buffer (which is a non-trivial amount on USB).

So, basically: if you configure a latency then you should get
something in the area of what you asked for, but we cannot make
guarantees. And the connection between latency and block sizes is even
fuzzier.

> Likewise when I configure the source to have a 20ms latency, I get a
> buffer of ~10ms. Is this normal?

Hmm. I guess. Really depends on the case. If you have a 20ms buffer
then simply due to wrap around the block sizes might get smaller than
this.
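
To illustrate the wrap-around effect, here is a minimal sketch in plain
C. It does not use any PulseAudio or ALSA API; usec_to_bytes() and
contiguous_readable() are hypothetical helpers, and the sample format
(44100 Hz, 4 bytes per frame) is only an assumed example. It shows how
a 20ms ring buffer can hand out a 10ms block when the read position
sits in the middle of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a duration to a byte count for a given
 * sample rate and frame size (bytes per sample times channels). */
static size_t usec_to_bytes(uint64_t usec, uint32_t rate, size_t frame_size) {
    return (size_t) (usec * rate / 1000000) * frame_size;
}

/* Hypothetical helper: the largest block that can be read from a ring
 * buffer in one piece. A block never crosses the end of the buffer, so
 * it is limited by the distance from read_pos to the wrap point. */
static size_t contiguous_readable(size_t buffer_size, size_t read_pos, size_t filled) {
    size_t until_wrap = buffer_size - read_pos;

    return filled < until_wrap ? filled : until_wrap;
}
```

With a full 20ms buffer (3528 bytes in this format) and the read
position halfway in, contiguous_readable() returns 1764 bytes, which
is 10ms. The rest of the data arrives in a second block after the wrap.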

> 2. I assumed that the memblockq routines (push, peek and drop) are
> thread-safe, is this a valid assumption?

Nope. You may not assume that.

Only very few functions in PA are thread-safe. This has various
reasons: speed, simplicity, fear of deadlock hell, but most
importantly that we try to minimize locking. The goal is to do things
entirely lock-free.

To fix this I'd suggest allocating a pa_asyncmsgq object for sending
over the memblocks from the source thread to the sink thread. You can
send arbitrary data with that including memchunks. It's thread-safe
(and lock-free). Then, on the receiver side push the data into a
pa_memblockq for flexible buffering.
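
The hand-off described above can be sketched without PulseAudio
headers. What follows is not pa_asyncmsgq itself but a minimal
single-producer/single-consumer ring built on C11 atomics; struct
chunk, struct chunk_queue, queue_post() and queue_get() are
hypothetical names standing in for the memchunk and the message
queue. It only shows the idea: one thread posts, the other thread
fetches, and no lock is taken.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memchunk: a reference to some audio data. */
struct chunk {
    const void *data;
    size_t length;
};

#define QUEUE_SLOTS 64 /* must be a power of two */

/* Single-producer/single-consumer ring. Only the source thread may call
 * queue_post(), only the sink thread may call queue_get(). */
struct chunk_queue {
    struct chunk slots[QUEUE_SLOTS];
    atomic_size_t head; /* next slot to write, advanced by the producer */
    atomic_size_t tail; /* next slot to read, advanced by the consumer */
};

static void queue_init(struct chunk_queue *q) {
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Source thread: returns false if the queue is full. */
static bool queue_post(struct chunk_queue *q, struct chunk c) {
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head - tail == QUEUE_SLOTS)
        return false;

    q->slots[head % QUEUE_SLOTS] = c;
    /* Publish the slot only after its contents have been written. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

/* Sink thread: returns false if nothing is pending. */
static bool queue_get(struct chunk_queue *q, struct chunk *out) {
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (head == tail)
        return false;

    *out = q->slots[tail % QUEUE_SLOTS];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}
```

On the receiving side each chunk fetched with queue_get() would then
be pushed into a queue that is only ever touched by the sink thread,
so that queue itself needs no thread-safety.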

> 3. For now the source and sink are synchronous but if they are not,
> how can I enable a sample-rate converter to correct for clock drifts?
> I see some code for SRC in both the input and output IO threads,
> however I don't understand how the tracking would be done.

module-combine handles this already. It probably would make sense to
copy the basic logic here: in the main thread simply measure the
latency of the sink and source every now and then, and then update the
sampling rate of the sink input with pa_sink_input_set_rate().

(This is actually quite hard to get right, and module-combine doesn't
entirely get it right. The problem is getting a somewhat atomic
snapshot of both latencies, since in the time between asking the two
latencies another memblock might have been sent over.)
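
A rough sketch of the rate computation, under stated assumptions: this
is not the module-combine code, adjust_rate() is a hypothetical helper,
and the 1% clamp is an arbitrary choice for the example. The idea is to
compare the measured overall latency against a target and nudge the
rate so that the difference is worked off over one adjustment
interval. The result is what one would hand to
pa_sink_input_set_rate().

```c
#include <stdint.h>

/* Hypothetical helper: pick a new sample rate for the sink input.
 *
 * latency_usec  - measured latency of source plus queue plus sink
 * target_usec   - latency we would like to have
 * interval_usec - time until the next adjustment
 *
 * If the latency is too high, too much data is queued, so the stream is
 * played slightly faster to drain the surplus, and vice versa. */
static uint32_t adjust_rate(uint32_t base_rate, int64_t latency_usec,
                            int64_t target_usec, int64_t interval_usec) {
    int64_t surplus = latency_usec - target_usec;
    int64_t delta = (int64_t) base_rate * surplus / interval_usec;
    int64_t max_delta = base_rate / 100; /* assumed limit: 1% of the base rate */

    if (delta > max_delta)
        delta = max_delta;
    if (delta < -max_delta)
        delta = -max_delta;

    return (uint32_t) ((int64_t) base_rate + delta);
}
```

For example, at 48000 Hz with 10ms too much latency and a 10s
interval this yields 48048 Hz. The snapshot problem mentioned above
is not solved by this; it only covers the arithmetic once the two
latencies are known.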

> 4. someone described a use case with BT, I would need to load the same
> module twice, i.e.
> load-module module-loopback source=bt-src sink=speakers
> load-module module-loopback source=mic sink=bt-sink
> Is there anything specific I need to do for this?

As long as you place all your stuff in the m->userdata field instead of
static variables you should be safe loading your module as many times
as you wish.
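
A minimal sketch of that pattern, assuming nothing about the actual
module: struct module is a hypothetical stand-in for pa_module that
carries only the userdata pointer, and module_init()/module_done() are
illustrative names, not the real entry points. Each loaded instance
allocates its own state, so two instances never share anything.

```c
#include <stdlib.h>

/* Hypothetical stand-in for pa_module; only the userdata field matters here. */
struct module {
    void *userdata;
};

/* All per-instance state lives here instead of in static variables. */
struct userdata {
    const char *source_name;
    const char *sink_name;
};

/* Called once per loaded instance. Returns 0 on success, -1 on failure. */
static int module_init(struct module *m, const char *source, const char *sink) {
    struct userdata *u = calloc(1, sizeof *u);

    if (!u)
        return -1;

    u->source_name = source;
    u->sink_name = sink;
    m->userdata = u;
    return 0;
}

/* Called when the instance is unloaded. */
static void module_done(struct module *m) {
    free(m->userdata);
    m->userdata = NULL;
}
```

Loading the module twice then simply means two calls to module_init()
with two different module objects, one for bt-src/speakers and one for
mic/bt-sink.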

> 5. I started from the module-sine code, however I have no idea what to
> do about some callbacks, specifically rewind and state-changed. Does
> anyone have a description of what's expected here?

Errks. Documentation! ;-)

They are very tersely documented in the header files. And in the wiki
there are some docs too. But I guess otherwise one needs to read the
sources and ask Lennart...

In the rewind callback you simply must rewind the read pointer in
the memblockq. It is called whenever we need to rewrite the hardware
playback buffer. For example, let's say we have 2s of buffer. Now a new stream
is added to the mix. We need to remix the whole 2s we already
wrote. Then we rewind each stream and ask for the data again and write
it to the buffer.

If you use a memblockq all you need to do is basically forward this
call to pa_memblockq_rewind() which does the heavy lifting for you.
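
What "rewinding the read pointer" means can be shown with a toy model.
This is not how pa_memblockq is implemented; struct rewind_queue,
rq_read() and rq_rewind() are hypothetical and track byte positions
only, no audio data. The point is that data which has already been
read is kept around for a while, so the read index can be moved back
and the same data handed out again.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a queue that supports rewinding. */
struct rewind_queue {
    size_t read_index;  /* absolute byte position of the next read */
    size_t write_index; /* absolute byte position of the next write */
    size_t history;     /* bytes before read_index that are still kept */
    size_t max_rewind;  /* upper bound for history */
};

/* Advance the read index. Returns the number of bytes actually read. */
static size_t rq_read(struct rewind_queue *q, size_t nbytes) {
    size_t avail = q->write_index - q->read_index;

    if (nbytes > avail)
        nbytes = avail;

    q->read_index += nbytes;
    q->history += nbytes;
    if (q->history > q->max_rewind)
        q->history = q->max_rewind; /* older data has been dropped */

    return nbytes;
}

/* Move the read index back. Returns the number of bytes actually rewound,
 * which is limited by how much already-read data is still kept. */
static size_t rq_rewind(struct rewind_queue *q, size_t nbytes) {
    if (nbytes > q->history)
        nbytes = q->history;

    q->read_index -= nbytes;
    q->history -= nbytes;
    return nbytes;
}
```

After a rewind the next rq_read() returns the same range again, which
is exactly what the remixing case needs.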

Whether you need to implement a state-changed cb depends. module-sine
uses it to trigger a rewind when the stream is created because it has
PCM data ready right-away. So it listens for the
PA_SINK_INPUT_INIT->PA_SINK_INPUT_RUNNING state change and requests
the rewind right away. In other modules however PCM data might not be
readily available, e.g. because it needs to be received first from a
client. In that case you probably don't want to rewind right-away on
that state change but instead wait until you actually got enough PCM
data and only then request the rewind. Your case is the latter I
guess.

Hope this helps!

I might be able to look into this myself during the next few days and
fix the remaining issues!

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net
http://0pointer.net/lennart/           GnuPG 0x1A015CC4