Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

Jaroslav Kysela <perex@xxxxxxxx> · Wed, 14 Aug 2024 16:48:30 +0200

On 14. 08. 24 13:58, Pierre-Louis Bossart wrote:


On 8/14/24 13:12, Shengjiu Wang wrote:
On Wed, Aug 14, 2024 at 5:40 PM Pierre-Louis Bossart
<pierre-louis.bossart@xxxxxxxxxxxxxxx> wrote:


Yes, to go further, I think we can use SND_AUDIOCODEC_PCM, then
the SRC type will be dropped.

sounds good.

But my understanding of the control means the .set_metadata() API, right?
As I said, the output rate, output format, and ratio modifier are applied to
the instances of ASRC,  which is the snd_compr_stream in driver.
so only the .set_metadata() API can be used for these purposes.

Humm, this is more controversial.

The term 'metadata' really referred to known information present in
headers or additional ID3 tags and not in the compressed file itself.
The .set_metadata was assumed to be called ONCE before decoding.

But here you have a need to update the ratio modifier on a regular basis
to compensate for the drift. This isn't what this specific callback was
designed for. We could change and allow this callback to be used
multiple times, but then this could create problems for existing
implementations which cannot deal with modified metadata on the fly.

.set_metadata can be called multi times now, no need to change currently.

Not really, this set_metadata() callback is used only for gapless
transitions between tracks, see fcplay.c in tinycompress.

And now I am really confused because tinycompress uses an IOCTL directly:

	metadata.key = SNDRV_COMPRESS_ENCODER_PADDING;
	metadata.value[0] = mdata->encoder_padding;
	if (ioctl(compress->fd, SNDRV_COMPRESS_SET_METADATA, &metadata))

Whereas you want to use the ops callback directly from the control layer?

What would present a userspace program from using the ioctl directly
then? In that case, why do we need the control? I must be missing something.

The whole discussion is which place is more appropriate for the runtime 
controls (like the frequency shift). My opinion is, if we have a layer for 
this which can be used for presence of those controls and even range / type / 
notifications, we should use it.

The new/updated ioctls bounded only to active file descriptor does not allow 
to monitor those values outside.

And then there's the problem of defining a 'key' for the metadata. the
definition of the key is a u32, so there's plenty of space for different
implementations, but a collision is possible. We'd need an agreement on
how to allocate keys to different solutions without changing the header
file for every implementation.

Can we define a private space for each case?   For example the key larger
than 0x80000000 is private, each driver can define it by themself?

that would be a possibility indeed - provided that the opens above are
straightened out.

It sounds like we'd need a 'runtime params' callback - unless there's a
better trick to tie the control and compress layers?

I don't follow. If the compress driver code uses card/device/subdevice 
numbers, we can address the control properly. The problem is just that 
subdevice support in missing the current compress code / API.

For me, the compress_params.h changes may also require to pay attention to the 
encoding/decoding of the current compressed streams. So something like this 
may be more appropriate for the first step:

diff --git a/include/uapi/sound/compress_params.h 
b/include/uapi/sound/compress_params.h
index ddc77322d571..c664d15410eb 100644
--- a/include/uapi/sound/compress_params.h
+++ b/include/uapi/sound/compress_params.h
@@ -347,6 +347,8 @@ union snd_codec_options {
  * @modes: Supported modes. See SND_AUDIOMODE defines
  * @formats: Supported formats. See SND_AUDIOSTREAMFORMAT defines
  * @min_buffer: Minimum buffer size handled by codec implementation
+ * @pcm_formats: Output (for decoders) or input (for encoders)
+ *               PCM formats (required to accel mode, 0 for other modes)
  * @reserved: reserved for future use
  *
  * This structure provides a scalar value for profiles, modes and stream
@@ -370,7 +372,8 @@ struct snd_codec_desc {
        __u32 modes;
        __u32 formats;
        __u32 min_buffer;
-       __u32 reserved[15];
+       __u32 pcm_formats;
+       __u32 reserved[14];
 } __attribute__((packed, aligned(4)));

 /** struct snd_codec
@@ -395,6 +398,8 @@ struct snd_codec_desc {
  * @align: Block alignment in bytes of an audio sample.
  *             Only required for PCM or IEC formats.
  * @options: encoder-specific settings
+ * @pcm_format: Output (for decoders) or input (for encoders)
+ *               PCM formats (required to accel mode, 0 for other modes)
  * @reserved: reserved for future use
  */

@@ -411,7 +416,8 @@ struct snd_codec {
        __u32 format;
        __u32 align;
        union snd_codec_options options;
-       __u32 reserved[3];
+       __u32 pcm_format;
+       __u32 reserved[2];
 } __attribute__((packed, aligned(4)));

 #endif

Then the SRC extension may be like:

diff --git a/include/uapi/sound/compress_params.h 
b/include/uapi/sound/compress_params.h
index c664d15410eb..5d51ecba6d55 100644
--- a/include/uapi/sound/compress_params.h
+++ b/include/uapi/sound/compress_params.h
@@ -334,6 +334,14 @@ union snd_codec_options {
 	struct snd_dec_wma wma_d;
 	struct snd_dec_alac alac_d;
 	struct snd_dec_ape ape_d;
+	struct {
+		__u32 out_sample_rate;
+	} src_d;
+} __attribute__((packed, aligned(4)));
+
+struct snd_codec_desc_src {
+	__u32 out_sample_rate_min;
+	__u32 out_sample_rate_max;
 } __attribute__((packed, aligned(4)));

 /** struct snd_codec_desc - description of codec capabilities
@@ -349,6 +357,7 @@ union snd_codec_options {
  * @min_buffer: Minimum buffer size handled by codec implementation
  * @pcm_formats: Output (for decoders) or input (for encoders)
  *               PCM formats (required to accel mode, 0 for other modes)
+ * @u_space: union space (for codec dependent date)
  * @reserved: reserved for future use
  *
  * This structure provides a scalar value for profiles, modes and stream
@@ -373,7 +382,11 @@ struct snd_codec_desc {
 	__u32 formats;
 	__u32 min_buffer;
 	__u32 pcm_formats;
-	__u32 reserved[14];
+	union {
+		__u32 u_space[6];
+		struct snd_codec_desc_src src;
+	} __attribute__((packed, aligned(4)));
+	__u32 reserved[8];
 } __attribute__((packed, aligned(4)));

 /** struct snd_codec

This will allow to handshake the output rate between user space and kernel 
driver. Eventually we can use a rate bitmap to be more precise in "struct 
snd_codec_desc_src" (or combination of range/bitmap).

						Jaroslav

--
Jaroslav Kysela <perex@xxxxxxxx>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.