[PATCH 03/11] resampler: Change interface, resampler may return the number of leftover frames

tanu.kaskinen@xxxxxxxxxxxxxxx (Tanu Kaskinen) · Mon, 11 Nov 2013 14:46:29 +0200

On Sat, 2013-11-09 at 02:34 +0600, Alexander E. Patrakov wrote:
> Peter Meerwald wrote:
> 
> > some resampler implementations (e.g. libsamplerate and ffmpeg) do not consume
> > the entire input buffer; the impl_resample() function now has a return value
> > returning the number of frames in the input buffer not processed
> > these frames must be saved in appropriate buffer and presented together with
> > new input data
> 
> (yes I know this is too late, as the patch is already pushed, but...)
> 
> First, I slightly object to the commit message. It may be
> misinterpreted as "they don't even look at the entire input buffer".
> While sometimes this is true (e.g. because the output buffer size was
> a limiting factor), there is another important reason why some samples
> need to be presented again. Namely, this happens if they affect both
> the old and the new chunk of output data and the resampler
> implementation relies on the user to housekeep the input buffers.
> 
> Second, I consider all resamplers that don't do this (i.e. that handle
> the necessary buffering internally in an opaque manner) as
> fundamentally broken if rewinds are allowed, and would argue for their
> refactoring or deletion. And here is why (yes, I am a perfectionist in
> the bad sense of this word).

Outright deletion is an option only if we think that the deleted
resampler is worthless or nearly worthless to us. Imperfect rewind
handling alone doesn't make a resampler worthless.

> When a rewind is requested on a sink input, currently pulseaudio
> completely resets the resampler. So, even if the same data is written
> again, the internal state is lost. This results in an audible click.
> 
> What's even worse, the exact correspondence between the input and
> output sample _count_ is, strictly speaking, lost due to these resets.
> Consider (just for illustration) a simple 1.5x downsampler that just
> copies input samples to output, dropping every third sample. Its
> internal state is just the input sample index modulo 3. By
> systematically rewinding it often enough (while still guaranteeing
> progress), we can cause it to drop less or more samples than it would
> during its normal operation.
> 
> Now, what could pulseaudio do in order to avoid this? A dumb answer
> is: it could save the internal state of the resampler to its own
> buffer and restore it. How often should it do that? A fail-safe answer
> is: "for every input sample", but that's expensive. We could to that
> for every N samples, and correspondingly round the rewind amount up.
> That's definitely possible, and that's the best what pulseaudio can do
> if it treats a resampler as some opaque object. Except that the
> resamplers don't provide any API for saving and restoring the state,
> which is fixable by importing them into pulseaudio tree (as we did
> with the ffmpeg resampler) and adding the corresponding functions.
> 
> The point is that we can do even better, because the internal state of
> a resampler is not an opaque object.
> 
> Here is what the resampler state actually contains for real-world resamplers.
> 
> FFMpeg uses this:
> 
> typedef struct AVResampleContext{
>     FELEM *filter_bank;
>     int filter_length;
>     int ideal_dst_incr;
>     int dst_incr;
>     int index;
>     int frac;
>     int src_incr;
>     int compensation_distance;
>     int phase_shift;
>     int phase_mask;
>     int linear;
> }AVResampleContext;
> 
> Only the following fields are written to during the normal operation
> (i.e. after initialization): frac, index, dst_incr,
> compensation_distance. And compensation_distance is unused (always 0)
> in pulseaudio, because av_resample_compensate() is never called. For
> the same reason, dst_incr is always equal to in_rate * (1 <<
> phase_shift). So, only index and frac remain. Look, no sample buffer!
> 
> Now let's consider how these fields are updated. It turns out that
> their increments do not depend on the actual samples and thus are, in
> theory, computable from the number of processed and/or produced
> samples only. So here is one very important result: ffmpeg resampler
> is essentially stateless, at the cost of asking for some input samples
> twice. Its so called "internal state" can be regenerated from scratch
> after any rewind, if one does the math, i.e. there is no need to save
> and restore anything. So, ffmpeg resampler is a promising candidate
> for clickless rewinds, as it is almost ready. Here is the list of what
> remains to be done:
> 
>  * A closed-form expression for index and frac just after producing
> Nth output sample.
>  * A closed-form expression for the index of the first and the last
> input sample affecting Nth output sample given the current filter
> length.
>  * A closed-form expression for the index of the first and the last
> output sample affected by Mth input sample.
>  * A new uint64_t field in the sink structure that counts the samples
> produced by the sink.
>  * More accurate calculation of the sink input rewind amount in
> pa_sink_input_process_rewind, instead of the inaccurate generic
> pa_resampler_request() function that tries to second-guess behind the
> resampler implementation without knowing about the filter length and
> with disregarding the fact that the "leftover" data wobbles in length
> slightly during the normal operation.
>  * More accurate conversion back into the sink domain, instead of the
> inaccurate pa_resampler_result() function.
>  * Regeneration of index and frac from the sample counter instead of
> resetting them.
> 
> Other resamplers contain essectially the same fields in their state
> structrue, plus possibly a buffer that contains either a copy of some
> input samples for looking again (not a problem, we can get the same
> from a render_memblockq), or a buffer for incompletely-accumulated
> output samples (a serious problem).
> 
> The list above is a bit oversimplified and assumes that we kill
> everything except the ffmpeg resampler, or except any other
> essentially-stateless resampler. It may be indeed a good idea to
> either extend the ffmpeg resampler for variable rates and kill all
> others, or adapt the resampler implementation from the Wine project -
> it is written by me, GPLed (but I would be happy to relicense if
> needed), stateless in the same sense as ffmpeg's, and supports
> variable sample rate.

I don't understand why making e.g. the ffmpeg resampler handle rewinds
correctly would necessitate the deletion of all resamplers that don't
handle rewinds correctly. That oddity notwithstanding, it sounds like
you have a solid plan for fixing the rewinds at least for a subset of
our resamplers, so can we expect patches coming at some point?

-- 
Tanu