Intersample peaks - Was: Normalize to -1db peak

Len Ovens <len@xxxxxxxxxxxxx> · Tue, 21 Apr 2020 09:53:12 -0700 (PDT)

On Tue, 21 Apr 2020, Will Godfrey wrote:

Just an aside here. I'm in contact with a number of people who work in
digital recording extensively. Their recommendation is that you should regard
your peak level as at least -3 dB due to the potential of inter-sample
peaks. With the headroom available from even 16-bit audio, you can go down to
-10 dB and still not have any noise issues. To really appreciate this check here:

There is lots of stuff out there on this. I came across an article that
proclaims: "This worst-case example occurs when the audio tone is 1/4 of
the sample rate." For a 48000 Hz sample rate that is a 12 kHz tone. This
seems unreasonable to me, though the idea that at 12 kHz the peak could be
as much as +3 dBFS makes more sense. It would seem to me that the
possibility of an intersample peak varies with frequency and that the
worst case would be at 1/2 SR, where the peak could be infinite (in which
case it would be rendered as 0). However, if everything above 20 kHz is
filtered, that worst case is removed.
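
To put a number on the 1/4 SR case, here is a minimal sketch in plain
Python. It assumes a steady, unit-amplitude sine at exactly 1/4 of the
sample rate and an ideal reconstruction, so the true peak is 1.0; the
function names are made up for illustration. It shows how the largest
sample value depends on where the samples happen to land on the waveform.

```python
import math

def largest_sample_fs4(phase_deg, n_samples=64):
    """Largest |sample| of a unit-amplitude sine at exactly fs/4."""
    phase = math.radians(phase_deg)
    # At fs/4 the phase advances by 90 degrees (pi/2) per sample.
    return max(abs(math.sin(math.pi / 2 * k + phase)) for k in range(n_samples))

def overshoot_db(largest_sample):
    """How far the true peak (1.0) sits above the largest sample, in dB."""
    return 20 * math.log10(1.0 / largest_sample)

for phase_deg in (0, 15, 30, 45):
    print(phase_deg, round(overshoot_db(largest_sample_fs4(phase_deg)), 2))
```

With a 45 degree offset every sample sits at about 0.707 of the true
peak, so a file normalized to 0 dBFS on its samples would reconstruct to
roughly +3 dBFS. With no offset the samples land on the peaks and there
is no overshoot.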

It would still seem that the worst case would be at the highest usable
frequency (20 kHz for audio). However, for recorded material (a signal
that was once acoustic) this is less of a problem, as most natural
sounding microphones have a frequency response that starts to drop off
before 20 kHz (closer to 16 kHz), and so the signal close to 20 kHz is
already attenuated. The root frequency of the highest used note seems to
be well under 10 kHz
(from: https://www.zytrax.com/tech/audio/audio.html ) and so the range
above 10 kHz is mostly harmonics and therefore at lower levels anyway
(except for the trend in female vocals to use an exciter to "candy" the
voice and make them sound 12 years old). This would indicate that even
setting the highest peak to -3 dBFS would be fine. This may be why 1/4 SR
is quoted as the worst case: it may be that in real world music this is
the highest frequency that still has enough energy to peak over lower
frequency sounds (they were testing with pre-recorded CDs).

If one is using internal soft synths, obviously any fundamental frequency
can be generated or any harmonic can be emphasized beyond normal for
artistic effect. I would like to see a graph that shows maximum
intersample peak against frequency or against % sample rate.
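
As a rough stand-in for such a graph, the sketch below brute-forces the
worst case for a steady sine at a few frequencies. The assumptions are
mine: a pure, continuous tone (no transients or short bursts, which can
behave differently), a 48000 Hz sample rate, a 1 degree phase grid and a
10 ms window. The function name is hypothetical, and the numbers are only
as fine as that grid.

```python
import math

def worst_case_overshoot_db(freq_hz, sample_rate=48000,
                            n_samples=480, phase_steps=180):
    """Worst-case gap (dB) between a steady sine's true peak and its
    largest sample, searched over the starting phase."""
    step = 2 * math.pi * freq_hz / sample_rate  # phase advance per sample
    lowest_sample_peak = 1.0
    for p in range(phase_steps):
        # |sin| repeats every 180 degrees, so 0..pi covers every case.
        phase = math.pi * p / phase_steps
        sample_peak = max(abs(math.sin(step * k + phase))
                          for k in range(n_samples))
        lowest_sample_peak = min(lowest_sample_peak, sample_peak)
    if lowest_sample_peak < 1e-9:
        return math.inf  # every sample can land on a zero crossing
    return -20 * math.log10(lowest_sample_peak)

for freq_hz in (1000, 6000, 12000, 16000, 20000, 24000):
    print(freq_hz, worst_case_overshoot_db(freq_hz))
```

Printing the result against freq_hz / sample_rate instead gives the
"% sample rate" view. For a steady tone the large values show up where
the frequency is a simple fraction of the sample rate, since the samples
then keep landing on the same few points of the waveform.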

I am sure there are some who feel that using a 96 kHz SR is the solution.
Or who can hear a difference between 48k and 96k for this reason. Or
maybe just use a more reasonable peak level to begin with. Considering
peaks 20 dB above an 85 dB SPL listening level and a quiet listening
space at 40 dB SPL, the dynamic range seems to be a maximum of about
65 dB... lots of room within the 96 dB range of 16-bit audio. (Note that
a really quiet studio may have a noise floor as low as 30 dB SPL.)
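
The arithmetic above, spelled out as a small sketch. The SPL figures are
the ones quoted in this mail; the variable names and the usual
20*log10(2^bits) estimate for PCM range are my own framing.

```python
import math

def pcm_dynamic_range_db(bits):
    """Theoretical range of linear PCM, about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

listening_level_db_spl = 85   # average programme level
peak_above_level_db = 20      # peaks above that level
room_noise_db_spl = 40        # quiet listening space

needed_db = listening_level_db_spl + peak_above_level_db - room_noise_db_spl
available_db = pcm_dynamic_range_db(16)
spare_db = available_db - needed_db

print(needed_db, round(available_db, 1), round(spare_db, 1))
```

With a 30 dB SPL room the needed range grows by 10 dB, which still fits
inside 16 bits.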

So the reality is that there would be no harm in setting peaks to even
-20 dBFS, except that "loud sounds better" and people want all their CDs,
sound files etc. to sound at about the same level. Another fun day in the
audio world.

--
Len Ovens
www.ovenwerks.net
_______________________________________________
Linux-audio-user mailing list
Linux-audio-user@xxxxxxxxxxxxxxxxxxxx
https://lists.linuxaudio.org/listinfo/linux-audio-user