[RFC PATCH] First draft implementation of better drain reporting

tanuk@xxxxxx (Tanu Kaskinen) · Sat, 02 Mar 2013 23:41:45 +0200

On Fri, 2013-03-01 at 16:05 +0100, David Henningsson wrote:
> First of all, this is an unfinished patch. At this point, it doesn't add any
> new "device underrun" events, only responds to drains more timely. It's quite
> tricky to get right (or I'm just dumb :-D ).

It's also quite tricky to review (or I'm just dumb) :)

> In short, the alsa-sink object first starts to keep track of how much has been
> played back (since last fill up). The sink is informed about this value, which it
> uses for two purposes:
>  1) If a stream is currently underrunning, it returns how much underrun the stream
>     actually is, so that alsa-sink can wake up at exactly this point in time.
>  2) If all of a stream has actually been played back, it calls into the stream's
>     new verify_underrun callback, and the stream's render data is emptied.
> 
> Things remaining are:
>  - First, I'm not sure the calculations are entirely correct. I think I shouldn't need
>    to add process_usec to the wakeup, but if I don't I wake up a few samples before the
>    stream has finished playback.
>  - This is implemented in mmap_write but should also be in unix_write
>  - A new device underrun callback for clients (requires protocol extensions)
>  - protocol-native's verify_underrun_cb contains copy-pasted code from pop_cb
>  - remove debug prints and maybe some other cleanup.
> 
> Signed-off-by: David Henningsson <david.henningsson at canonical.com>
> ---
>  src/modules/alsa/alsa-sink.c    |   41 +++++++++++++++++++++++++++++++--------
>  src/pulsecore/protocol-native.c |   28 ++++++++++++++++++++++++++
>  src/pulsecore/sink-input.c      |   33 +++++++++++++++++++++++++++++++
>  src/pulsecore/sink-input.h      |    6 ++++++
>  src/pulsecore/sink.c            |   25 ++++++++++++++++++++++++
>  src/pulsecore/sink.h            |    2 ++
>  6 files changed, 127 insertions(+), 8 deletions(-)
> 
> diff --git a/src/modules/alsa/alsa-sink.c b/src/modules/alsa/alsa-sink.c
> index 69d006e..59be22f 100644
> --- a/src/modules/alsa/alsa-sink.c
> +++ b/src/modules/alsa/alsa-sink.c
> @@ -146,6 +146,8 @@ struct userdata {
>      uint64_t since_start;
>      pa_usec_t smoother_interval;
>      pa_usec_t last_smoother_update;
> +    int64_t since_fill;
> +    int64_t expected_avail;
>  
>      pa_idxset *formats;
>  
> @@ -508,7 +510,7 @@ static size_t check_left_to_play(struct userdata *u, size_t n_bytes, pa_bool_t o
>  static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polled, pa_bool_t on_timeout) {
>      pa_bool_t work_done = FALSE;
>      pa_usec_t max_sleep_usec = 0, process_usec = 0;
> -    size_t left_to_play;
> +    size_t left_to_play, input_underrun;
>      unsigned j = 0;
>  
>      pa_assert(u);
> @@ -535,11 +537,14 @@ static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polle
>          }
>  
>          n_bytes = (size_t) n * u->frame_size;
> +        u->since_fill += n_bytes - u->expected_avail;

This appears to be the only place where expected_avail is used, and in
this context a lot of time has likely been passed since expected_avail
was last updated, so the actual expected empty space in the hw buffer is
usually much more than expected_avail. Therefore, the variable name is
misleading. I don't have any better ideas, though
(expected_avail_without_taking_the_passing_of_time_into_account is a bit
too verbose), so never mind...

>  
>  #ifdef DEBUG_TIMING
> -        pa_log_debug("avail: %lu", (unsigned long) n_bytes);
> +        pa_log_debug("avail: %lu, since fill: %ld, advance: %lu", (unsigned long) n_bytes,
> +                     (long) u->since_fill, (unsigned long) n_bytes - u->expected_avail);

n_bytes - u->expected_avail may result in a negative value (at least in
theory), but the format specifier is %lu. You probably meant to apply
the cast to the result of the substraction, not only to n_bytes. But
even then, a bogus value gets printed in case of a negative value. Maybe
that's OK, though, since a negative value can occur only in case of a
bug somewhere (in alsa or pulseaudio).

>  #endif
>  
> +        u->expected_avail = n_bytes;
>          left_to_play = check_left_to_play(u, n_bytes, on_timeout);
>          on_timeout = FALSE;
>  
> @@ -600,6 +605,7 @@ static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polle
>              const snd_pcm_channel_area_t *areas;
>              snd_pcm_uframes_t offset, frames;
>              snd_pcm_sframes_t sframes;
> +            size_t written;
>  
>              frames = (snd_pcm_uframes_t) (n_bytes / u->frame_size);
>  /*             pa_log_debug("%lu frames to write", (unsigned long) frames); */
> @@ -635,12 +641,14 @@ static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polle
>  
>              p = (uint8_t*) areas[0].addr + (offset * u->frame_size);
>  
> -            chunk.memblock = pa_memblock_new_fixed(u->core->mempool, p, frames * u->frame_size, TRUE);
> +            written = frames * u->frame_size;
> +            chunk.memblock = pa_memblock_new_fixed(u->core->mempool, p, written, TRUE);
>              chunk.length = pa_memblock_get_length(chunk.memblock);
>              chunk.index = 0;
>  
>              pa_sink_render_into_full(u->sink, &chunk);
>              pa_memblock_unref_fixed(chunk.memblock);
> +            u->since_fill = 0;

Shouldn't this be done only after a successful snd_pcm_mmap_commit()
call?

>  
>              if (PA_UNLIKELY((sframes = snd_pcm_mmap_commit(u->pcm_handle, offset, frames)) < 0)) {
>  
> @@ -655,21 +663,29 @@ static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polle
>  
>              work_done = TRUE;
>  
> -            u->write_count += frames * u->frame_size;
> -            u->since_start += frames * u->frame_size;
> +            u->write_count += written;
> +            u->since_start += written;
> +            u->expected_avail -= written;
>  
>  #ifdef DEBUG_TIMING
> -            pa_log_debug("Wrote %lu bytes (of possible %lu bytes)", (unsigned long) (frames * u->frame_size), (unsigned long) n_bytes);
> +            pa_log_debug("Wrote %lu bytes (of possible %lu bytes)", (unsigned long) written, (unsigned long) n_bytes);
>  #endif
>  
> -            if ((size_t) frames * u->frame_size >= n_bytes)
> +            if (written >= n_bytes)
>                  break;
>  
> -            n_bytes -= (size_t) frames * u->frame_size;
> +            n_bytes -= written;
>          }
>      }
>  
> +    input_underrun = pa_sink_find_underrun(u->sink, u->since_fill, left_to_play);

I believe left_to_play is not the correct variable to use here. It is
based on snd_pcm_avail(), but we're interested in the total latency, so
snd_pcm_delay() should be used.

Hmm, on the other hand, the new device underrun notifications should be
sent as soon as the input underrun "leaves" the alsa ring buffer. So,
the drain and underrun notifications should happen at different times.

> +
>      if (u->use_tsched) {
> +        pa_usec_t underrun_sleep = 0;
> +        if (input_underrun > 0)
> +             /* Wakeup to notify that stream is drained */
> +            underrun_sleep = pa_bytes_to_usec(left_to_play - input_underrun, &u->sink->sample_spec);
> +
>          *sleep_usec = pa_bytes_to_usec(left_to_play, &u->sink->sample_spec);
>          process_usec = pa_bytes_to_usec(u->tsched_watermark, &u->sink->sample_spec);
>  
> @@ -677,6 +693,11 @@ static int mmap_write(struct userdata *u, pa_usec_t *sleep_usec, pa_bool_t polle
>              *sleep_usec -= process_usec;
>          else
>              *sleep_usec = 0;
> +        if (underrun_sleep > 0) {
> +            underrun_sleep += process_usec; /* Add some margin. I haven't figured out why we need this :-( */

I think the margin should be a separate variable/constant. I don't think
there's any connection between process_usec and the margin that we need
here.

> +            *sleep_usec = PA_MIN(*sleep_usec, underrun_sleep);
> +        }
> +
>      } else
>          *sleep_usec = 0;
>  
> @@ -1099,6 +1120,8 @@ static int unsuspend(struct userdata *u) {
>      pa_smoother_reset(u->smoother, pa_rtclock_now(), TRUE);
>      u->smoother_interval = SMOOTHER_MIN_INTERVAL;
>      u->last_smoother_update = 0;
> +    u->since_fill = 0;
> +    u->expected_avail = u->hwbuf_size;

This should probably be done also when the sink is initialized.

>  
>      u->first = TRUE;
>      u->since_start = 0;
> @@ -1663,6 +1686,8 @@ static int process_rewind(struct userdata *u) {
>              pa_log_info("Tried rewind, but was apparently not possible.");
>          else {
>              u->write_count -= rewind_nbytes;
> +            u->since_fill -= rewind_nbytes;
> +            u->expected_avail += rewind_nbytes;
>              pa_log_debug("Rewound %lu bytes.", (unsigned long) rewind_nbytes);
>              pa_sink_process_rewind(u->sink, rewind_nbytes);
>  
> diff --git a/src/pulsecore/protocol-native.c b/src/pulsecore/protocol-native.c
> index a4ee920..0a39f2e 100644
> --- a/src/pulsecore/protocol-native.c
> +++ b/src/pulsecore/protocol-native.c
> @@ -231,6 +231,7 @@ enum {
>      CONNECTION_MESSAGE_REVOKE
>  };
>  
> +static pa_bool_t sink_input_verify_underrun_cb(pa_sink_input *i);

s/pa_bool_t/bool/, s/TRUE/true/ and s/FALSE/false/.

>  static int sink_input_pop_cb(pa_sink_input *i, size_t length, pa_memchunk *chunk);
>  static void sink_input_kill_cb(pa_sink_input *i);
>  static void sink_input_suspend_cb(pa_sink_input *i, pa_bool_t suspend);
> @@ -1171,6 +1172,7 @@ static playback_stream* playback_stream_new(
>  
>      s->sink_input->parent.process_msg = sink_input_process_msg;
>      s->sink_input->pop = sink_input_pop_cb;
> +    s->sink_input->verify_underrun = sink_input_verify_underrun_cb;
>      s->sink_input->process_rewind = sink_input_process_rewind_cb;
>      s->sink_input->update_max_rewind = sink_input_update_max_rewind_cb;
>      s->sink_input->update_max_request = sink_input_update_max_request_cb;
> @@ -1574,6 +1576,32 @@ static int sink_input_process_msg(pa_msgobject *o, int code, void *userdata, int
>  }
>  
>  /* Called from thread context */
> +static pa_bool_t sink_input_verify_underrun_cb(pa_sink_input *i) {
> +    playback_stream *s;
> +
> +    pa_sink_input_assert_ref(i);
> +    s = PLAYBACK_STREAM(i->userdata);
> +    playback_stream_assert_ref(s);
> +
> +    if (pa_memblockq_is_readable(s->memblockq))
> +        return FALSE;
> +
> +    pa_log_debug("%s of '%s'", s->drain_request ? "Finished playback" : "Real underrun", pa_strnull(pa_proplist_gets(i->proplist, PA_PROP_MEDIA_NAME)));
> +
> +   if (s->drain_request) {
> +        s->drain_request = FALSE;
> +        pa_log("IMPLEMENTOR: Sending PLAYBACK_STREAM_MESSAGE_DRAIN_ACK");
> +        pa_asyncmsgq_post(pa_thread_mq_get()->outq, PA_MSGOBJECT(s), PLAYBACK_STREAM_MESSAGE_DRAIN_ACK, PA_UINT_TO_PTR(s->drain_tag), 0, NULL, NULL);
> +    } else if (!s->is_underrun)
> +        pa_asyncmsgq_post(pa_thread_mq_get()->outq, PA_MSGOBJECT(s), PLAYBACK_STREAM_MESSAGE_UNDERFLOW, NULL, pa_memblockq_get_read_index(s->memblockq), NULL, NULL);
> +
> +    s->is_underrun = TRUE;
> +/*    playback_stream_request_bytes(s); */
> +    return TRUE;
> +}
> +
> +
> +/* Called from thread context */
>  static int sink_input_pop_cb(pa_sink_input *i, size_t nbytes, pa_memchunk *chunk) {
>      playback_stream *s;
>  
> diff --git a/src/pulsecore/sink-input.c b/src/pulsecore/sink-input.c
> index 519f47e..7968abd 100644
> --- a/src/pulsecore/sink-input.c
> +++ b/src/pulsecore/sink-input.c
> @@ -826,6 +826,18 @@ pa_usec_t pa_sink_input_get_latency(pa_sink_input *i, pa_usec_t *sink_latency) {
>      return r[0];
>  }
>  
> +/* Returns length of underrun, in sink bytes. Called from thread context */
> +uint64_t pa_sink_input_get_underrun_for_sink(pa_sink_input *i)
> +{
> +    uint64_t u = i->thread_info.underrun_for;
> +    if (u == (uint64_t) -1)
> +        return 0;
> +    if (!i->thread_info.resampler)
> +        return u;
> +    return pa_resampler_result(i->thread_info.resampler, u);
> +}
> +
> +
>  /* Called from thread context */
>  void pa_sink_input_peek(pa_sink_input *i, size_t slength /* in sink frames */, pa_memchunk *chunk, pa_cvolume *volume) {
>      pa_bool_t do_volume_adj_here, need_volume_factor_sink;
> @@ -1020,6 +1032,27 @@ void pa_sink_input_drop(pa_sink_input *i, size_t nbytes /* in sink sample spec *
>  }
>  
>  /* Called from thread context */
> +pa_bool_t pa_sink_input_verify_real_underrun(pa_sink_input *i, size_t nbytes /* in the sink's sample spec */) {

I'd rename nbytes to underrun_for. A comment can be added explaining why
it's not necessarily equal to i->thread_info.underrun_for.

> +    pa_sink_input_assert_ref(i);
> +    pa_sink_input_assert_io_context(i);
> +
> +    if (nbytes < pa_memblockq_get_maxrewind(i->thread_info.render_memblockq))
> +        return FALSE;
> +    if (pa_memblockq_is_readable(i->thread_info.render_memblockq))
> +        return FALSE;
> +
> +    if (!i->verify_underrun)
> +        return FALSE;
> +    if (i->verify_underrun(i)) {
> +        /* All valid data has been played back, so we can empty this queue. */
> +        pa_memblockq_silence(i->thread_info.render_memblockq);

Why is pa_memblockq_silence() needed?

> +        return TRUE;
> +    }
> +    return FALSE;
> +}
> +
> +
> +/* Called from thread context */
>  void pa_sink_input_process_rewind(pa_sink_input *i, size_t nbytes /* in sink sample spec */) {
>      size_t lbq;
>      pa_bool_t called = FALSE;
> diff --git a/src/pulsecore/sink-input.h b/src/pulsecore/sink-input.h
> index 8bd5912..097a9fb 100644
> --- a/src/pulsecore/sink-input.h
> +++ b/src/pulsecore/sink-input.h
> @@ -135,6 +135,9 @@ struct pa_sink_input {
>       * the full block. */
>      int (*pop) (pa_sink_input *i, size_t request_nbytes, pa_memchunk *chunk); /* may NOT be NULL */
>  
> +    /* TODO: Write something here */
> +    pa_bool_t (*verify_underrun) (pa_sink_input *i);
> +
>      /* Rewind the queue by the specified number of bytes. Called just
>       * before peek() if it is called at all. Only called if the sink
>       * input driver ever plans to call
> @@ -407,6 +410,9 @@ int pa_sink_input_process_msg(pa_msgobject *o, int code, void *userdata, int64_t
>  pa_usec_t pa_sink_input_set_requested_latency_within_thread(pa_sink_input *i, pa_usec_t usec);
>  
>  pa_bool_t pa_sink_input_safe_to_remove(pa_sink_input *i);
> +uint64_t pa_sink_input_get_underrun_for_sink(pa_sink_input *i);
> +pa_bool_t pa_sink_input_verify_real_underrun(pa_sink_input *i, size_t nbytes /* in the sink's sample spec */);
> +
>  
>  pa_memchunk* pa_sink_input_get_silence(pa_sink_input *i, pa_memchunk *ret);
>  
> diff --git a/src/pulsecore/sink.c b/src/pulsecore/sink.c
> index 6ebe956..977b555 100644
> --- a/src/pulsecore/sink.c
> +++ b/src/pulsecore/sink.c
> @@ -915,6 +915,31 @@ void pa_sink_move_all_fail(pa_queue *q) {
>      pa_queue_free(q, NULL);
>  }
>  
> + /* Called from IO thread context */
> +size_t pa_sink_find_underrun(pa_sink *s, size_t since_fill, size_t left_to_play) {
> +    pa_sink_input *i;
> +    void *state = NULL;
> +    size_t result = 0;
> +
> +    pa_sink_assert_ref(s);
> +    pa_sink_assert_io_context(s);
> +/*    pa_log("PLAYBACK: %d bytes, left to play %d, total %d", (int) since_fill, (int) left_to_play, (int) (since_fill + left_to_play)); */
> +    PA_HASHMAP_FOREACH(i, s->thread_info.inputs, state) {
> +        size_t uf = pa_sink_input_get_underrun_for_sink(i);

As a minor optimization and also for better readability, I think it
would be better to maintain a separate for_underrun_sink variable in
i->thread_info instead of having pa_sink_input_get_underrun_for_sink().

> +        if (uf == 0)
> +            continue;
> +        uf += since_fill;
> +        if (pa_sink_input_verify_real_underrun(i, uf))

This notifies an input about a real underrun. I think that's a bit
surprising side effect for pa_sink_find_underrun(), so I think it would
be clearer to have a separate function for doing those notifications,
called before pa_sink_find_underrun().

> +            continue;
> +        if (uf < left_to_play && uf > result)
> +            result = uf;
> +    }
> +
> +    if (result > 0)
> +        pa_log_debug("Found underrun %d bytes ago", (int) result);
> +    return result;
> +}
> +
>  /* Called from IO thread context */
>  void pa_sink_process_rewind(pa_sink *s, size_t nbytes) {
>      pa_sink_input *i;
> diff --git a/src/pulsecore/sink.h b/src/pulsecore/sink.h
> index fcda5ef..6510405 100644
> --- a/src/pulsecore/sink.h
> +++ b/src/pulsecore/sink.h
> @@ -492,6 +492,8 @@ void pa_sink_update_volume_and_mute(pa_sink *s);
>  
>  pa_bool_t pa_sink_volume_change_apply(pa_sink *s, pa_usec_t *usec_to_next);
>  
> +size_t pa_sink_find_underrun(pa_sink *s, size_t since_fill, size_t left_to_play);

pa_sink_find_next_input_underrun() might be a clearer name? I'd also
prefer the return value to be bytes to play before the underrun hits the
speakers, instead of returning how long it has been since the
render_memblockq became empty.

-- 
Tanu