Converting Unsigned Byte Audio to 2-Channel WAV with SOX solved.

Martin McCormick <martin@xxxxxxxxxxxxxxxxxx> · Wed, 08 Mar 2006 10:07:55 -0600

Peder Hedlund writes:
> Have you tried replacing "-t ub" with "-t uw" ?

	That created an interesting effect.  The original data are
unsigned linear 8-bit samples.  Silence is represented by 0x7f or 0x80
with 0 being the low extreme and 0xff being the highest extreme.
Using uw, or unsigned words, told sox that this was 8,000 16-bit
samples per second.  Each pair of bytes was read as one sample, so
there were half as many samples to play.  It did that just fine and
produced audio that was twice the correct pitch.
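
	As a rough illustration of the two readings (not what sox
does internally), here is a small Python sketch.  The sample bytes
are made up, the helper name u8_to_s16 is hypothetical, and the
little-endian byte order for the 16-bit reading is an assumption;
the halved sample count does not depend on it.

```python
import struct

RATE = 8000  # samples per second, as given with -r8000


def u8_to_s16(data: bytes) -> list[int]:
    """Map unsigned 8-bit samples (0..255, silence at 0x80) to signed 16-bit."""
    return [(b - 0x80) * 256 for b in data]


# Made-up input: 8000 bytes, i.e. one second of audio at 8000 samples/sec.
raw = bytes([0x00, 0x7F, 0x80, 0xFF] * 2000)

# Correct reading (-t ub): one sample per byte.
as_bytes = u8_to_s16(raw)

# Misreading (-t uw): every two bytes become one 16-bit sample
# (little-endian assumed here for illustration only).
as_words = struct.unpack("<%dH" % (len(raw) // 2), raw)

print("read as bytes:", len(as_bytes) / RATE, "seconds")
print("read as words:", len(as_words) / RATE, "seconds")
```

	The second reading has half the samples, so the same material
plays in half the time, which is heard as double the pitch.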

> There's also a "-u" flag for "unsigned linear".

> What does 'file' say about your file?

	There is no header on that file so it just says "data."

	The -u flag appeared to have no effect since sox already
understood the data were unsigned linear.

sox: resample opts: Kaiser window, cutoff 0.950000, beta 16.000000

sox: Input file cdda.ub: using sample rate 8000
	size bytes, encoding unsigned, 1 channel
sox: Do not support unsigned with 16-bit data.  Forcing to Signed.
sox: Writing Wave file: Microsoft PCM format, 2 channels, 44100 samp/sec
sox:         176400 byte/sec, 4 block align, 16 bits/samp
sox: Output file output.wav: using sample rate 44100
	size shorts, encoding signed (2's complement), 2 channels
sox: Output file: comment "Processed by SoX"

sox: resample: rate ratio 80:441, coeff interpolation not needed

sox: Finished writing Wave file, 125728820 data bytes 62864410 samples
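
	The figures in that output can be cross-checked with a few
lines of Python.  This is only arithmetic on the numbers sox printed
above; the variable names are mine.

```python
from fractions import Fraction

in_rate, out_rate = 8000, 44100      # -r8000 input, -r44100 output
channels, bytes_per_sample = 2, 2    # -c 2, -w (16-bit words)

# Rate ratio in lowest terms, reported by sox as 80:441.
ratio = Fraction(in_rate, out_rate)

# WAV header fields reported by sox.
block_align = channels * bytes_per_sample   # bytes per stereo frame
byte_rate = out_rate * block_align          # bytes per second

# Totals from the "Finished writing" line.
data_bytes = 125728820
samples = data_bytes // bytes_per_sample    # individual 16-bit samples
frames = data_bytes // block_align          # stereo frames
duration_s = data_bytes / byte_rate         # playing time in seconds

print("ratio", ratio, "block align", block_align, "byte rate", byte_rate)
print("samples", samples, "frames", frames, "seconds", round(duration_s, 1))
```

	The ratio, block align, byte rate and sample count all agree
with what sox reported, so the header and output size look sane and
the glitches are not explained by a wrong rate or size.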

	I began to wonder if the problem was in the resampling
algorithm.  If you listen to the sound, the glitches are at regular
intervals at about 10 hits per second.  When there is voice present,
it sounds as if the "Sixty Minutes" stopwatch were ticking away at
around ten ticks per second and there were little segments missing
from words.  As I previously said, the pitch was correct.

	I then tried the following line to change the algorithm:

sox -V -r8000 cdda.ub -t wav -c 2 -w -r44100 output.wav polyphase .95 

	With polyphase, there is no audible difference between the
upconverted .wav file and the original unsigned linear data file.

	Being at 8000 samples/sec, the audio is voice grade and is a
recording of two-way radio communications so it isn't the best anyway,
but it appears that using the polyphase algorithm fixed the problem.

	The latest Debian port of sox is

sox: Version 12.17.7.  The version may well have been different a
couple of years ago when I first wrote that script using the resample
algorithm instead of polyphase.  Maybe the previous version of sox
had some different defaults and somehow forced polyphase.

	Thanks, Peder, for your suggestions on things to try as they
got me to look at everything more closely.

Martin McCormick WB5AGZ  Stillwater, OK 
Systems Engineer
OSU Information Technology Department Network Operations Group