[PATCH v2] bluez5-device: Rewrite of thread function, reduce send buffer size for a2dp sink

georg@xxxxxxxx (Georg Chini) · Sun, 25 Mar 2018 09:44:41 +0200

On 25.03.2018 08:34, Tanu Kaskinen wrote:
> On Sat, 2018-03-24 at 19:27 +0100, Georg Chini wrote:
>> On 24.03.2018 18:13, Tanu Kaskinen wrote:
>>> On Mon, 2018-03-05 at 08:49 +0100, Georg Chini wrote:
>>>> The rewrite of the thread function does not change functionality much,
>>>> most of it is only cleanup, minor bug fixing  and documentation work.
>>>>
>>>> This patch also changes the send buffer size for a2dp sink to avoid lags
>>>> after temporary connection drops, following the proof-of-concept patch
>>>> posted by Dmitry Kalyanov.
>>>>
>>>> Bug-Link: https://bugs.freedesktop.org/show_bug.cgi?id=58746
>>>> ---
>>>>    src/modules/bluetooth/module-bluez5-device.c | 275 ++++++++++++++++++---------
>>>>    1 file changed, 182 insertions(+), 93 deletions(-)
>>> Thanks! Reading the new code caused less trouble than I recall the
>>> bluetooth thread_func() previously having caused, so the changes are
>>> good.
>>>
>>> There are some issues pointed out below.
>>>> +    }
>>>>    
>>>>        u->rtpoll_item = pa_rtpoll_item_new(u->rtpoll, PA_RTPOLL_NEVER, 1);
>>>>        pollfd = pa_rtpoll_item_get_pollfd(u->rtpoll_item, NULL);
>>>> @@ -861,7 +903,7 @@ static void setup_stream(struct userdata *u) {
>>>>        pollfd->events = pollfd->revents = 0;
>>>>    
>>>>        u->read_index = u->write_index = 0;
>>>> -    u->started_at = 0;
>>>> +    u->started_at = pa_rtclock_now();
>>> This seems unnecessary. write_block() resets started_at anyway when the
>>> first write is done.
>>>
>>> Hmm... Now I noticed that u->started_at is used before the first write.
>>> But it seems wrong to set u->started_at twice using (slightly)
>>> different values. u->started_at is used in this code before the first
>>> write:
>>>
>>>       time_passed = pa_rtclock_now() - u->started_at;
>>>
>>> I think before the first write time_passed should be set to exactly 0.
>> We have to discard audio that accumulated during the startup
>> time of the device to make sure audio is in sync. There will always
>> be some bytes dropped at the start. So it is correct to set the
>> initial time twice.
> Dropped? I don't see how that happens. In the first iteration
> time_passed is some small value, likely less than one block. audio_sent
> is zero. Skipping is done only if (time_passed - audio_sent) is more
> than two blocks.

It definitely is more than two blocks.

>
> If the delay is large for some reason, then skipping might happen, but
> that's not good. A long startup delay shouldn't cause audio to be
> dropped.

Why shouldn't it? If audio accumulated, it needs to be dropped
to achieve initial sync.

> AV sync only requires accurate latency reporting, and the GET_LATENCY
> handler uses u->started_at in its calculations. The delay between
> setup_stream() where started_at is first set and the first
> write_block() call doesn't affect the latency at all (and if it did,
> GET_LATENCY would not take that latency into account). Consider this
> thought experiment: if the startup delay is 10 seconds for some reason,
> surely the end latency is still much less 10 seconds (even without
> dropping any audio).

Don't understand. If the startup delay is 10 seconds and audio is 
accumulating
during that time, the delay will always be 10 seconds. How should it 
ever change?
So without dropping (and without resetting start time) A/V Sync does not 
work.

>
>>>>    
>>>> +                    /* A new block needs to be sent. */
>>>>                        if (audio_sent <= time_passed) {
>>>> -                        pa_usec_t audio_to_send = time_passed - audio_sent;
>>>> +                        size_t bytes_to_send = pa_usec_to_bytes(time_passed - audio_sent, &u->sample_spec);
>>>>    
>>>> -                        /* Never try to catch up for more than 100ms */
>>>> -                        if (u->write_index > 0 && audio_to_send > MAX_PLAYBACK_CATCH_UP_USEC) {
>>>> -                            pa_usec_t skip_usec;
>>>> +                        /* There are more than two blocks that need to be written.
>>>> +                         * We cannot catch up, therefore discard everything older
>>>> +                         * than two block sizes. */
>>> It seems that "cannot catch up" is not always true. It's probably true
>>> that usually the buffer has room for only two blocks, but it's possible
>>> that the buffer size is large enough for more blocks. Suggested
>>> wording:
>>>
>>> "There are more than two blocks that need to be written. Usually the
>>> socket buffer has room for only two blocks, but even if there's more
>>> room, let's not bother trying to catch up more than two blocks. We'll
>>> discard everything older than two block sizes."
>> "Catch up" means catch up with the accumulated audio. We cannot play
>> faster, so everything we do not discard here will cause a permanent delay.
>> This has nothing to do with buffer size.
> Ok. Since I misunderstood the comment, maybe make it more elaborate:
>
> "There are more than two blocks that need to be written. It seems that
> the socket has not been accepting data fast enough (could be due to
> hiccups in the wireless transmission). We need to discard everything
> older than two block sizes to keep the latency from growing."

OK.

>
>>> + if ((result = write_block(u)) < 0)
>>>>                                goto fail;
>>>> -                    }
>>>> -
>>>> -                    if (n_written == 0)
>>>> -                        pa_log("Broken kernel: we got EAGAIN on write() after POLLOUT!");
>>>>    
>>>> -                    do_write -= n_written;
>>>> -                    writable = false;
>>>> -                }
>>>> +                        blocks_to_write -= result;
>>>> +                        writable = false;
>>>> +                        have_written = true;
>>> Shouldn't have_written be set to true only if result > 0? If we got
>>> EAGAIN, then no write actually happened.
>>>
>> It needs to be set anyway, because we will try to send the same block
>> again on the next iteration if we got EAGAIN. If have_written is false, the
>> thread will not be woken up by POLLOUT.
> Why will it not be woken up? writable will still be set to false, so we
>   do request wakeup on POLLOUT:
>
>      /* Set events to wake up the thread */
>      pollfd->events = (short) (((have_sink && !writable) ? POLLOUT : 0) | (have_source ? POLLIN : 0));
>
> If we set have_written to true and the socket stays unwritable, then we
>   don't set up the timer and hence don't drop audio every 500 ms as
> would normally be the case.
>
You are right. I'll test it.