Re: Handling complex matrix mixers in ALSA

Asahi Lina <lina@xxxxxxxxxxxxx> · Fri, 12 Jul 2024 18:48:09 +0900

On 7/2/24 9:46 AM, Takashi Sakamoto wrote:
> On Mon, Jul 01, 2024 at 04:17:11PM +0200, Takashi Iwai wrote:
>> As Geoffrey already suggested, the matrix size can be reduced since
>> each kcontrol element can be an array, so the number of controls can
>> be either column or row size of the matrix, which is well manageable.
>  
> Additionally, a snd_kcontrol structure can provide multiple control
> elements by its 'count' member. I call it 'control element set'. It can
> reduce allocation in kernel space. If the hardware in this case provides
> software interface to access to all source coefficients to one
> destination at once, it is suitable in the case.

The hardware interface is basically a command stream of "set" commands
for each individual mixer node, so there is no particular
dimension/grouping that is more optimal. You can send up to 64~128
arbitrary updates per 512-byte USB bulk packet (depending on whether you
have to allocate nodes or just update existing nodes).

Due to the max 2048 active node limitation, the driver is going to have
to process/diff bulk updates anyway regardless of the way we group them
(I will have to scan the new values, first update any zero values to
free up nodes, then update nonzero values, and bail with an error before
sending the update to the hardware if we run out of nodes). Hopefully
this will never happen in real-life scenarios, but it does mean that
some matrix mixer configurations are "illegal" and will be refused by
the driver, due to hardware limitations.
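
As a rough illustration of the two-pass update described above (free nodes first, then allocate, and refuse the whole update if the limit would be exceeded), here is a minimal Python sketch. The function name `plan_update`, the `(dst, src)` keying, and the dict-based state are all hypothetical; the real driver would be C and would track nodes however the hardware requires.

```python
MAX_ACTIVE_NODES = 2048  # active node limit mentioned above


def plan_update(current, new, max_nodes=MAX_ACTIVE_NODES):
    """Plan the "set" commands for a bulk mixer update.

    current: dict mapping (dst, src) -> gain for the present state
             (a missing key or a gain of 0 means no node is allocated).
    new:     dict mapping (dst, src) -> gain for the entries being written.

    Returns a list of ((dst, src), gain) commands with all zeroing commands
    first, so nodes are freed before new ones are allocated. Raises
    ValueError, without producing any commands, if the resulting state
    would need more than max_nodes active nodes.
    """
    frees, sets = [], []
    active = {key for key, gain in current.items() if gain != 0}

    for key, gain in new.items():
        if current.get(key, 0) == gain:
            continue  # unchanged, nothing to send
        if gain == 0:
            frees.append((key, 0))
            active.discard(key)
        else:
            sets.append((key, gain))
            active.add(key)

    # Check the final node count before anything is sent to the hardware.
    if len(active) > max_nodes:
        raise ValueError("update needs %d nodes, limit is %d"
                         % (len(active), max_nodes))
    return frees + sets
```

Because the check runs on the final node set before any command is emitted, a rejected update leaves the hardware state untouched.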

> For example, assuming the matrix mixer has 34 destination and 66
> sources, they can be expressed by 34 control elements with 66 array
> elements. A single snd_kcontrol structure can provide them, as long as
> they have the same nature. The control elements are identified by index
> value.

It took me a while to understand what you meant here, but I think I get
it: Using a single snd_kcontrol for the entire mixer, with 34 indexed
elements each taking 66 array values, right?

How do these kinds of controls show up in alsamixer and other userspace
mixer tools? Are they usable at all, or just with low-level access via
amixer/alsactl?

> Once I talked with Philippe Bekaert about the issue, then we found another
> issue about the way to distinguish both each control elements and the array
> members. The usage of ALSA control interface heavily relies on the name of
> control elements, while a single snd_kcontrol structure provides one name
> and all of the controls provided by it have the same name. We've
> investigated to use TLV (Type-Length-Value) function of ALSA control core
> to provide channel information about the sources and destinations, but no
> further work yet[1].

Yeah, that's another issue... these interfaces have fairly homogeneous
channels for the mixer, so it's not a big deal, but if we want to settle
on any particular standard for matrix mixers we're going to need some
way to inform userspace of what the numbered sources/destinations mean...

> I think it better to have another care that in this case we have restriction
> for the size of array; e.g. 128 array elements for integer (long) type of
> value. The restriction is not the matter in your case.

A related device (the MADIface USB, that I don't own but I can probably
support with some traces from someone who does) has, if I'm
understanding the manual correctly, up to a 192x128 mixer mode (128
hardware inputs + 64 playback inputs per output mix), so that would
exceed the maximum number of array elements. I could split each output
submix control into multiple array controls per input group to keep the
count under 128 for each, but that starts getting a bit weird and
arbitrary I think...
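
To make the splitting concrete, a small Python sketch of chunking one output's sources into array controls of at most 128 values. The helper names `split_sources` and `locate` are hypothetical, and the limit of 128 is the integer-type restriction quoted above.

```python
MAX_INT_VALUES = 128  # per-element array limit for integer controls


def split_sources(num_sources, max_values=MAX_INT_VALUES):
    """Return (first_source, count) chunks covering num_sources inputs."""
    return [(start, min(max_values, num_sources - start))
            for start in range(0, num_sources, max_values)]


def locate(src, max_values=MAX_INT_VALUES):
    """Map a source number to (chunk_index, offset_within_chunk)."""
    return divmod(src, max_values)


# 192 inputs per output mix split into two array controls.
print(split_sources(192))
```

For the 192-input case this happens to land on the same boundary as the hardware/playback grouping (128 + 64), but for other sizes the chunk boundaries would be arbitrary.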

> 
>> The VU meter can be provided as volatile read-only controls, too.
>>
>> So from the API-wise POV, it'll be a most straightforward
>> implementation.
> 
> As a side note, the design of software interface for recent hardware
> requires floating point values for this kind of data, while it is not
> supported in ALSA control core and its userspace interface.

I don't know how the VU meter data works yet, but there's another issue
here with the mixer controls. This device uses a 1-bit scale selector
and a 14-bit value, basically a trivial floating point format with a
1-bit exponent. There is also a sign bit (can invert the phase of any
mixer node).

Effectively I have a linear gain between -0x10000 and 0x10000 where
0x8000 is 0dB and -0x8000 is 0dB(inverted), but for values outside of
the -0x3fff..0x3fff range, it loses 3 bits of LSB precision.
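
A small Python model of that format, to show which values survive a round trip. The field layout is an assumption made only for illustration (fine range stores the magnitude directly, coarse range stores the magnitude shifted right by 3); the actual bit packing on the wire is not described here, and the names `encode`, `decode` and `quantize` are hypothetical.

```python
FINE_MAX = 0x3FFF    # largest magnitude kept at full precision
GAIN_MAX = 0x10000   # largest magnitude overall
COARSE_SHIFT = 3     # LSBs dropped in the coarse range


def encode(gain):
    """Split a linear gain into (invert, coarse, mantissa).

    Assumed layout: sign bit, 1-bit scale selector, 14-bit mantissa.
    """
    if not -GAIN_MAX <= gain <= GAIN_MAX:
        raise ValueError("gain out of range")
    invert = gain < 0
    mag = abs(gain)
    if mag <= FINE_MAX:
        return invert, False, mag
    return invert, True, mag >> COARSE_SHIFT


def decode(invert, coarse, mantissa):
    """Inverse of encode(): rebuild the gain the hardware would apply."""
    mag = mantissa << COARSE_SHIFT if coarse else mantissa
    return -mag if invert else mag


def quantize(gain):
    """Gain actually applied after a round trip through the format."""
    return decode(*encode(gain))
```

Under these assumptions `quantize(0x8000)` is exactly `0x8000`, so unity is representable, while every value in `0x8000..0x8007` collapses to `0x8000`.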

The inversion isn't really representable in a single control, right? So
I'd have to have a whole separate boolean matrix control set for the
sign bits, I think?

And then I need to figure out how to scale the values... if I use the
full range of 0..0x10000 then I can use DECLARE_TLV_DB_LINEAR to declare
how it maps to dB, and then I guess I would just have to truncate the
values sent to the hardware (when in the coarse range) but always keep
track of the control value in the driver, so userspace doesn't get
confused with "impossible" values? Of course that would be "lying" about
the hardware precision, since 8 possible values would map to the same
hardware volume at higher scales...

From reading the alsa-lib code, I think I need to use TLV_DB_GAIN_MUTE
as the min gain and then that turns it into a linear control where 0 = mute?

I'm also a bit worried about being able to accurately represent
+0dB/unity (for bit-perfect passthrough), since ALSA uses 1/100th of a
dB as scale for the TLVs (I think?). x2 gain is +6.02(...) dB, but that
isn't enough significant digits to precisely map 0x8000 to 0dB:

>>> hex(int(0x10000 * (10**(-6.02 / 20))))
'0x8002'
>>> hex(int(0x10000 * (10**(-6.03 / 20))))
'0x7fdc'

So userspace that tries to use the TLV scale data to set "0dB" using the
simple API is unlikely to actually get precisely 0dB.

(This is also a problem for the output fader controls unrelated to the
matrix mixer, since they use the exact same scale encoding with inversion)

Maybe the driver should just use an arbitrary log dB TLV scale for the
controls and internally map to the hardware values? That means doing the
conversions internally and lying about precision at the lower end but at
least it would avoid rounding error when setting a gain to 0dB...
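
A minimal Python sketch of that internal mapping, assuming the control value is expressed in 0.01 dB steps relative to unity. The mute floor `FLOOR_CDB` and the function name `cdb_to_linear` are hypothetical and only there to make the example complete.

```python
UNITY = 0x8000      # hardware value for 0 dB
GAIN_MAX = 0x10000  # hardware maximum (x2, roughly +6.02 dB)
FLOOR_CDB = -9000   # hypothetical: at or below -90.00 dB is treated as mute


def cdb_to_linear(cdb):
    """Convert a control value in 0.01 dB units to a linear hardware gain.

    Anchoring the scale at UNITY instead of GAIN_MAX means 0 dB maps to
    exactly 0x8000, with rounding error pushed to the other steps.
    """
    if cdb <= FLOOR_CDB:
        return 0
    value = round(UNITY * 10 ** (cdb / 2000))
    return min(value, GAIN_MAX)


print(hex(cdb_to_linear(0)))
```

The trade-off is the one already noted: many low control values map to the same small hardware gain, so the reported scale overstates precision at the bottom.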

> 
>> OTOH, if you need more efficiency (e.g. the control access is way too
>> much overhead), it can be implemented freely via a hwdep device and
>> your own ioctls or mmaps, too.  But this is literally h/w dependent,
>> and the API becomes too specific, no other way than using own tool, as
>> a drawback.
> 
> [1] https://github.com/PhilippeBekaert/snd-hdspe/issues/13

There's one more thing I'd like to ask. Would it be useful for me to
submit just the streaming part of the device support upstream first
(which would work with the userspace app for config) and then worry
about designing the mixer control interface later? Or should it all be
done in one submission?

~~ Lina