About the DolbyA thing...
(Please ignore the continual run-on sentences!!! My composition skills suck.)
The magic in DolbyA is about how the attack/release conforms to the waveform.
If you want to emulate a DolbyA unit, and its *really good* behavior, look carefully at the attack/release circuit.
The general compressor design as in R. Dolby's patents does produce GOOD results for general use (beyond DolbyA applications.) I have played with reasonably precise SW emulators for GP compression, and they seem to work very cleanly. My current project UTILIZES the DolbyA emulation, but the emulator works well enough for now. The current 'challenge' is the dispersive compressor/expander that is also part of the project.
Referring to the Dolby compressor circuit, those diodes in the circuit are NOT simple rectifiers; the I/V diode log curves are a critical part of the proper behavior. Early on, in an attempt to 'improve' the design, I designed all kinds of supposedly 'superior' detectors with otherwise similar attack/release, with pretty terrible results when decoding real DolbyA materials. This need for compliance is especially important below about 600Hz, so that the waveform and envelope contours are precisely the same for encoding and decoding. If the attack/release behavior below about 600Hz isn't correct, then there will be distortion in the audio. (The attack/release times on DolbyA generally follow the envelope above about 600Hz, and are waveform sensitive below that frequency, especially below the 100Hz range.) Also, the 'compliant' attack/release is layered in two phases. Incorrect attack times are especially egregious, producing serious amounts of harshness if too fast. There is also a propagation delay between the 3k-20k+ range and the 9k-20k+ range compressors/expanders (because of the competitive feedback scheme). With an incorrect propagation delay through the software compressor emulator, the HF edges can be obliterated or overly enhanced.
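For anyone unfamiliar with what 'attack/release' means in code, here is a minimal sketch of a generic one-pole envelope follower. It is NOT the DolbyA detector: it has no diode log curve, no two-phase layering and no waveform-following behavior at low frequencies, which is exactly why something this simple decodes real DolbyA material badly. The function name and parameters are made up for illustration.

```python
import math

def envelope_follower(samples, sample_rate, attack_s, release_s):
    """Generic one-pole peak envelope follower (illustrative only).

    Assumption: a plain linear detector with fixed time constants.
    The real DolbyA detector is nonlinear and is not modelled here.
    """
    # Per-sample smoothing coefficients derived from the time constants.
    a_att = math.exp(-1.0 / (attack_s * sample_rate))
    a_rel = math.exp(-1.0 / (release_s * sample_rate))
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Rising level: use the (usually faster) attack coefficient.
        # Falling level: use the release coefficient.
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out
```

With a 1 ms attack at 48 kHz, a unit step reaches about 63% of its final value after 48 samples, which is the usual meaning of a time constant. Shortening attack_s makes the gain control track individual waveform peaks more closely.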
I have a decoder that has been iteratively optimized over the years, and has recently become fairly good at decoding ripped DolbyA materials. Until the software closely emulated the DolbyA HW behavior, including the FET circuit VGA, the results were much worse. The software decoder source is ugly because I never cared about sanitizing it; however, I do use it every day as a part of another application. The decoder consists of a fitted curve for the FET circuit gain, and an iteratively tweaked emulation of the diode curves.
I do suggest that a compressor with general DolbyA characteristics might be reasonable to do with SoX. Truly encoding/decoding DolbyA materials is a totally different world.
(Also, the Q values for the band splitter are approx 0.42 for 8.8kHz, 0.45 for 2.7kHz and approx 1 or 1.2 for somewhere around 75-80Hz. I forget the exact numbers. I emulated the bandpasses with a 'tweaked' set of FIR filters, and unfolded the compressors into actual expanders, along with a few other math tweaks.)
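To give a feel for what Q values this low imply, here is a small sketch that treats each section as a textbook unity-peak second-order band-pass. That filter type is an assumption made only for illustration; the decoder described above uses tweaked FIR filters, and the function names are hypothetical.

```python
import math

def bandpass_gain(f_hz, f0_hz, q):
    """Magnitude of a unity-peak second-order band-pass at f_hz.

    Assumption: ideal analog band-pass, not the actual DolbyA network.
    """
    ratio = f_hz / f0_hz
    return 1.0 / math.sqrt(1.0 + (q * (ratio - 1.0 / ratio)) ** 2)

def half_power_edges(f0_hz, q):
    """Lower and upper -3 dB frequencies; their difference is f0/Q."""
    half = 1.0 / (2.0 * q)
    root = math.sqrt(1.0 + half * half)
    return f0_hz * (root - half), f0_hz * (root + half)

if __name__ == "__main__":
    # Centre frequencies and Q values as quoted above; the last pair is
    # only approximate because the exact numbers were not remembered.
    for f0, q in [(8800.0, 0.42), (2700.0, 0.45), (80.0, 1.0)]:
        lo, hi = half_power_edges(f0, q)
        print(f"f0={f0:g} Hz Q={q:g}: -3 dB from {lo:.0f} Hz to {hi:.0f} Hz")
```

A Q below 0.5 gives a very broad, gentle response under this assumption, so the bands overlap heavily rather than being sharply separated.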
Doing it again, I would implement my decoder more cleanly and do a more professional job of writing it.
Currently, the decoder is not normally used all by itself, but is used in an array of decoders cobbled together to create a wider range expander.
Anyone really interested in emulating DolbyA behavior, feel free to contact me. I lurk on Hoffman and Audiophile Style, where I blather about my crazy processing project. It is probably very off topic for this mailing list...
Have fun!!!
John
On Wednesday, February 12, 2025 at 06:01:43 PM EST, SoX NG <sox_ng@xxxxxxxxxxxx> wrote:
On 12/02/25 20:56, Doug Lee wrote:
> Off-list communication on this one welcome so we don't stray too far. :-)
Me, I tend to write off-list messages and then send them to the list by mistake.
Fortunately I seldom say anything too outrageous :-)
However, as long as it's relevant to anyone interested in precise low-level
audio processing...
> On Wed, Feb 12, 2025 at 08:21:34PM +0100, Martin Guy wrote:
>> Do the flow diagrams work for you
> On paper yes those can work if brailled right; but on a computer, at least I personally don't usually follow those.
Thanks, I'll keep that in mind.
> I understand compand and mcompand fairly well but am not an expert at crafting effects for specific
> purposes with them.
For that I had to dig out Ray Dolby's original paper plus a few other
descriptions and measure his hand-drawn graphs with a ruler to figure out
the curves' coordinates but was happy when one followed by the other
produced something that sounded the same.
The --plot output was very similar to the original diagrams so it went
fairly well.
Dolby B and C instead have a single sliding frequency band, not something
I think SoX can do, but they are worse cheap versions for consumer boxes.
>> A recent example where I did not succeed was a recording someone sent me where the
>> volume cut way down at a certain point.
The new "softvol" effect may help here. It's a simple volume multiplier that
immediately reduces the volume when a sample would have clipped and
optionally increases the volume continuously so that it doubles every N
seconds.
I use it all the time to blast the quartiere with nonstop music as it
makes the audibility of the result independent of the original recording
volume and stuff in the quiet passages as audible as in the loud ones.
Since I start at volume *= 400 (!) the very start of each track is
always interesting.
There's an extreme example of it at work under http://martinwguy.net/test -
the long filename full of sox effect names.
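The behavior described above can be sketched roughly as follows. This is not the actual softvol source, only a guess at the logic from the description: samples are assumed to be floats in the range -1.0 to 1.0, and the function name and parameters are invented for the example.

```python
def softvol_sketch(samples, sample_rate, initial_gain=1.0, double_every_s=None):
    """Volume multiplier that backs off on clipping and optionally creeps up.

    Assumption: full scale is +/-1.0. Returns (output samples, final gain).
    """
    gain = initial_gain
    # Per-sample growth factor so that the gain doubles every N seconds.
    if double_every_s:
        growth = 2.0 ** (1.0 / (double_every_s * sample_rate))
    else:
        growth = 1.0
    out = []
    for x in samples:
        y = x * gain
        if abs(y) > 1.0:
            # This sample would have clipped: drop the gain at once so
            # that it lands exactly on full scale.
            gain = 1.0 / abs(x)
            y = x * gain
        out.append(y)
        gain *= growth
    return out, gain
```

Starting with a large initial_gain, as in the volume *= 400 case, means the first loud sample sets the working level, which is why the start of each track can sound odd.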
However, I'm not sure you *can* restore something like what you describe
to its original dynamics because information has been lost.
> I did, though, just write a small Python utility for scanning files via sound very fast
> using a two-stage SoX pipe and 1-10ms tones on a 256-frequency range to represent bytes
Do you mean a single tone that blips at one of 256 frequencies according
to its value and recognizing certain characteristic sequences? Interesting.
One of my passions is for log-frequency-axis spectrograms, which translate
sound from the audio domain to the visual one - not a lot of use to you but
maybe for deaf people to let them see speech and music - however there are
inverse techniques to turn a spectrogram back into the original sound, or a
best effort at it. Peter Zinovieff tried this in the 1960s and in one of his
notebooks describes scanning a picture of a flower and converting it into
sound as a particularly fascinating experiment. My own efforts in this
direction are documented
at https://wikidelia.net/wiki/Spectrograms#Inverse_spectrograms
That suggests adding JPG and PNG as input and output formats to SoX
as synonyms for spectrograms without axes, presumably with the lower and
upper frequencies and the pixel-columns-per-second and dynamic range
in image comments or as format parameters.
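To show why the lower and upper frequencies would need to travel with the image, here is a small sketch of how a pixel row could map to a frequency on a logarithmic axis. The function name, the row numbering (row 0 taken as the lowest frequency) and the parameter names are all assumptions for illustration, not anything SoX does today.

```python
def row_to_frequency(row, height, f_min, f_max):
    """Frequency represented by a pixel row on a log-frequency axis.

    Assumption: row 0 is f_min and row height-1 is f_max, with equal
    frequency ratios between adjacent rows.
    """
    if height < 2:
        raise ValueError("need at least two rows to define an axis")
    if not 0 <= row < height:
        raise ValueError("row out of range")
    return f_min * (f_max / f_min) ** (row / (height - 1))
```

Without f_min and f_max the same picture could be rendered as sound at any pitch range, and without the pixel-columns-per-second figure at any speed.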
One thing at a time though...
M
_______________________________________________
Sox-users mailing list
Sox-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/sox-users