Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support


On Fri, Aug 9, 2024 at 3:25 PM Pierre-Louis Bossart
<pierre-louis.bossart@xxxxxxxxxxxxxxx> wrote:
>
>
> >>>> Then there's the issue of parameters, we chose to only add parameters
> >>>> for standard encoders/decoders. Post-processing is highly specific and
> >>>> the parameter definitions vary from one implementation to another -
> >>>> and usually parameters are handled in an opaque way with binary
> >>>> controls. This is best handled with a UUID that needs to be known only
> >>>> to applications and low-level firmware/hardware, the kernel code should
> >>>> not have to be modified for each and every processing and to add new
> >>>> parameters. It just does not scale and it's unmaintainable.
> >>>>
> >>>> At the very least if you really want to use this compress API,
> >>>> extend it to use a non-descript "UUID-defined" type and an opaque
> >>>> set of parameters with this UUID passed in a header.
> >>>
> >>> We don't need to use UUID-defined scheme for simple (A)SRC
> >>> implementation. As I noted, the specific runtime controls may use
> >>> existing ALSA control API.
> >>
> >> "Simple (A)SRC" is an oxymoron. There are multiple ways to define the
> >> performance, and how the drift estimator is handled. There's nothing
> >> simple if you look under the hood. The SOF implementation has for
> >> example those parameters:
> >>
> >> uint32_t source_rate;           /**< Define fixed source rate or */
> >>                 /**< use 0 to indicate need to get */
> >>                 /**< the rate from stream */
> >> uint32_t sink_rate;             /**< Define fixed sink rate or */
> >>                 /**< use 0 to indicate need to get */
> >>                 /**< the rate from stream */
> >> uint32_t asynchronous_mode;     /**< synchronous 0, asynchronous 1 */
> >>                 /**< When 1 the ASRC tracks and */
> >>                 /**< compensates for drift. */
> >> uint32_t operation_mode;        /**< push 0, pull 1, In push mode the */
> >>                 /**< ASRC consumes a defined number */
> >>                 /**< of frames at input, with varying */
> >>                 /**< number of frames at output. */
> >>                 /**< In pull mode the ASRC outputs */
> >>                 /**< a defined number of frames while */
> >>                 /**< number of input frames varies. */
> >>
> >> They are clearly different from what is suggested above with a
> >> 'ratio-mod'.
> >
> > I don't think so. The proposed (A)SRC for compress-accel is just one
> > case for the above configs where the input is known and output is
> > controlled by the requested rate. The I/O mechanism is abstracted enough
> > in this case and the driver/hardware/firmware must follow it.
>
> ASRC is usually added when the nominal rates are known but the clock
> sources differ and the drift needs to be estimated at run-time and the
> coefficients or interpolation modified dynamically.
>
> If the ratio is known exactly and there's no clock drift, then it's a
> different problem where the filter coefficients are constant.
>
> >> Same if you have a 'simple EQ'. There are dozens of ways to implement
> >> the functionality with FIR, IIR or a combination of the two, and
> >> multiple bands.
> >>
> >> The point is that you have to think upfront about a generic way to pass
> >> parameters. We didn't have to do it for encoders/decoders because we
> >> catered to well-documented standard solutions only. By choosing to
> >> support PCM processing, a new can of worms is now open.
> >>
> >> I repeat: please do not make the mistake of listing all processing with
> >> an enum and a new structure for parameters every time someone needs a
> >> specific transform in their pipeline. We made that mistake with SOF and
> >> had to backtrack rather quickly. The only way to scale is an identifier
> >> that is NOT included in the kernel code but is known to higher and
> >> lower-levels only.
> >
> > There are two ways - black box (UUID - as you suggested) - or well
> > defined purpose (abstraction). For your example 'simple EQ', the
> > parameters should be the band (frequency range) volume values. It's
> > abstract and the real filters (resp. implementation) used behind may
> > depend on the hardware/driver capabilities.
>
> Indeed there is a possibility that the parameters are high-level, but
> that would require firmware or hardware to be able to generate actual
> coefficients from those parameters. That usually requires some advanced
> math which isn't necessarily obvious to implement with fixed-point hardware.
>
> > From my view, the really special cases may be handled as black box, but
> > others like (A)SRC should follow some well-defined abstraction IMHO to
> > not force user space to handle all special cases.
>
> I am not against the high-level abstractions, e.g. along the lines of
> what Android defined:
> https://developer.android.com/reference/android/media/audiofx/AudioEffect
>
> That's not sufficient however, we also need to make sure there's an
> ability to provide pre-computed coefficients in an opaque manner for
> processing that doesn't fit in the well-defined cases. In practice there
> are very few 3rd party IPs that fit in well-defined cases; everyone has
> secret-sauce parameters and options.

Appreciate the discussion.

Let me explain the reason for the change:

I use the metadata ioctl because ALSA controls are bound to the sound
card. What I want is controls that are bound to a snd_compr_stream,
because the ASRC compress sound card can support multiple instances
(the ASRC can run several conversions in parallel). ALSA controls
can't be used for this case; the only choice in the current compress
API is the metadata ioctl. The metadata ioctl can also be called many
times, which meets the ratio modifier requirement (the ratio may drift
on the fly).
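
To illustrate the per-stream idea, here is a rough sketch. It is not
code from the patch: the key value, the ppm encoding of the ratio
modifier, and the names compr_metadata, src_stream, fill_ratio_mod and
stream_set_metadata are all made up for illustration. Only the layout
(a 32-bit key plus eight 32-bit values) mirrors struct
snd_compr_metadata.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct snd_compr_metadata: one key, eight values. */
struct compr_metadata {
	uint32_t key;
	uint32_t value[8];
};

/* Hypothetical key for the ratio modifier; not an existing UAPI value. */
#define EXAMPLE_METADATA_SRC_RATIO_MOD 0x100u

/* Hypothetical per-stream state: every open stream keeps its own ratio. */
struct src_stream {
	int32_t ratio_mod_ppm;
};

/* Pack a signed ppm offset (assumed encoding) into value[0]. */
static void fill_ratio_mod(struct compr_metadata *md, int32_t ppm)
{
	assert(md != NULL);
	memset(md, 0, sizeof(*md));
	md->key = EXAMPLE_METADATA_SRC_RATIO_MOD;
	memcpy(&md->value[0], &ppm, sizeof(ppm));
}

/* Stand-in for the driver side: apply the metadata to one stream only. */
static int stream_set_metadata(struct src_stream *s,
			       const struct compr_metadata *md)
{
	if (md->key != EXAMPLE_METADATA_SRC_RATIO_MOD)
		return -1;
	memcpy(&s->ratio_mod_ppm, &md->value[0], sizeof(s->ratio_mod_ppm));
	return 0;
}
```

In user space the same structure would be passed with
ioctl(fd, SNDRV_COMPRESS_SET_METADATA, &md) on the compress device fd
of one stream, so two parallel conversions can carry different ratio
modifiers, and the call can be repeated whenever the ratio changes.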

The compress API also uses the codec as the unit for capability query
and parameter setting, so I think we need to define 'SND_AUDIOCODEC_SRC'
and 'struct snd_dec_src'. 'snd_dec_src' only defines the output format
and output rate; the channel definition just reuses snd_codec.ch_in.
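
A minimal sketch of that layout, under stated assumptions: the field
names format_out and rate_out, the codec ID value, and the trimmed
example_codec structure are illustrative only and may not match the
patch. The real struct snd_codec has more fields than shown here, and
carrying the input rate in sample_rate is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical codec ID; the real value would be assigned by the patch. */
#define EXAMPLE_AUDIOCODEC_SRC 0x100u

/* Sketch of 'struct snd_dec_src': output format and output rate only. */
struct example_dec_src {
	uint32_t format_out;
	uint32_t rate_out;
};

/* Trimmed stand-in for struct snd_codec (most fields omitted). */
struct example_codec {
	uint32_t id;
	uint32_t ch_in;        /* channel count, shared by input and output */
	uint32_t sample_rate;  /* input rate (assumed) */
	uint32_t format;       /* input format */
	struct example_dec_src src;
};

/* Build a codec description for one sample rate conversion. */
static struct example_codec make_src_codec(uint32_t channels,
					   uint32_t rate_in, uint32_t fmt_in,
					   uint32_t rate_out, uint32_t fmt_out)
{
	struct example_codec c = {0};

	assert(channels > 0);
	c.id = EXAMPLE_AUDIOCODEC_SRC;
	c.ch_in = channels;
	c.sample_rate = rate_in;
	c.format = fmt_in;
	c.src.format_out = fmt_out;
	c.src.rate_out = rate_out;
	return c;
}
```

With this shape the output side needs no channel field of its own,
because a pure sample rate conversion keeps the channel count of the
input.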

I understand your concern, but there seems to be no better option.
If you have one, please guide me. Thanks.

Best regards
Shengjiu Wang