[linux-audio-user] Re: Resampling Audio Libraries & Sinc Resamplers


 



Part 2 of (maybe) 3

For those unfamiliar with Fourier analysis and synthesis, here is a
concrete example demonstrating potentially serious problems with sinc
resamplers when doing bulk conversions at constant rates.  These
problems are real and could easily result in audible artifacts,
especially after further processing, which I assume matters to Linux
audio users.

------------------------

File 123163main_cas-skr1-112203.wav is the NASA file recently
mentioned on LAU: a public-domain, taxpayer-supported WAV file
sampled at 5000 samples per second.  The choice of file was
arbitrary.  I happened to resample it before adding reverb and
posting it for interested LAU'ers a while ago, so I decided to use
it for a comparison for Steve Harris.

Two resamplers:

1) sinc resampler: 

$ sndfile-resample -to 44100 -c 0 123163main_cas-skr1-112203.wav \
saturn_sndfile-resample.wav


2) FFT with large windows:

Sampster in Mixster (stuff I wrote myself)


The comparison sets every 50th sample in the original file against
every 441st sample in each of the two resampled files (both strides
span 0.01 seconds: 50/5000 = 441/44100) for the first 9.5 seconds of
the files.  The 9.5 seconds was chosen arbitrarily; there is nothing
special about it.  Ideally these particular samples should match
exactly.  Any error indicates corruption of the original data at the
exact locations where the original samples were taken.  The last two
columns show the difference (resampled value minus original value)
at each matching point.  Note that the magnitudes in the last column
(sndfile-resample) are significantly greater than those in the
next-to-last column (FFT).
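
The matching procedure described above can be sketched in a few lines
of Python.  This is only an illustration of the stride arithmetic, not
the script actually used for the table: the function name
matched_differences is hypothetical, the samples are assumed to be
already decoded into plain lists of numbers, and it assumes that match
#1 is sample 0 in both files and that the resampler adds no delay.

```python
def matched_differences(original, resampled, orig_step=50, resampled_step=441):
    """Compare samples that fall on the same time instants in both files.

    Returns a list of (match_number, original_value, resampled_value, diff)
    tuples, where diff = resampled_value - original_value.
    Assumes both sequences start at the same instant (no resampler delay).
    """
    # Number of instants that exist in both sequences.
    count = min((len(original) - 1) // orig_step,
                (len(resampled) - 1) // resampled_step) + 1
    rows = []
    for m in range(count):
        o = original[m * orig_step]
        r = resampled[m * resampled_step]
        rows.append((m + 1, o, r, r - o))
    return rows


# Tiny synthetic demonstration (made-up numbers, not the NASA file):
# a 5000 Hz "original" and a 44100 Hz "resampled" list, 0.02 s each.
demo_original = [10 * i for i in range(101)]
demo_resampled = [0] * 883
demo_resampled[0], demo_resampled[441], demo_resampled[882] = 1, 500, 997

for match, o, r, diff in matched_differences(demo_original, demo_resampled):
    print(f"{match}: {o} {r} {diff}")
```

With real files, the two lists would come from whatever WAV reader is
at hand; the strides 50 and 441 hold only for the 5000 to 44100
conversion discussed here.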


Match#      Original  FFT     sndfile           Match#: FFT-diff sndfile-diff
1:          0.00000 -2.00000 19.0000            1: -2   19
2:          386.000 384.000 437.000             2: -2   51
3:         -181.000 -183.000 -178.000           3: -2    3
4:         -500.000 -502.000 -538.000           4: -2  -38
5:         -1065.00 -1067.00 -1068.00           5: -2   -3
6:         -54.0000 -56.0000 -28.0000           6: -2   26
7:         -120.000 -122.000 -55.0000           7: -2   65
8:         -348.000 -350.000 -344.000           8: -2    4
9:          827.000 825.000 805.000             9: -2  -22

<snip>

344:         -67.0000 -71.0000 100.000          344: -4  167
345:         -378.000 -382.000 -275.000         345: -4  103
346:         -37.0000 -41.0000 -101.000         346: -4  -64
347:         -209.000 -213.000 -19.0000         347: -4  190
348:          269.000 265.000 86.0000           348: -4 -183
349:          62.0000 58.0000 27.0000           349: -4  -35
350:          427.000 423.000 446.000           350: -4   19
351:          154.000 150.000 -47.0000          351: -4 -201
352:          619.000 615.000 52.0000           352: -4 -567
353:         -202.000 -206.000 111.000          353: -4  313
354:         -366.000 -370.000 205.000          354: -4  571   <<< 
   OUCH!  Hope this doesn't get expanded.  Over 100x larger error.

355:         -146.000 -150.000 8.00000          355: -4  154
356:          549.000 545.000 558.000           356: -4    9
357:          279.000 275.000 -34.0000          357: -4 -313
358:         -110.000 -114.000 -12.0000         358: -4   98
359:         -184.000 -188.000 199.000          359: -4  383
360:         -215.000 -219.000 -417.000         360: -4 -202
361:          244.000 240.000 74.0000           361: -4 -170
362:         -474.000 -478.000 -152.000         362: -4  322
363:          188.000 184.000 562.000           363: -4  374

<snip>

938:         -1448.00 -1449.00 -1468.00         938: -1  -20
939:         -1203.00 -1204.00 -1161.00         939: -1   42
940:          3210.00 3209.00 3111.00           940: -1  -99 <<< about 
   100x larger error at 10% full scale

941:          5767.00 5766.00 5838.00           941: -1   71
942:         -656.000 -657.000 -628.000         942: -1   28
943:         -5165.00 -5166.00 -5163.00         943: -1    2
944:          1547.00 1546.00 1584.00           944: -1   37
945:          4410.00 4409.00 4445.00           945: -1   35
946:          1912.00 1911.00 1881.00           946: -1  -31
947:          5947.00 5946.00 5829.00           947: -1 -118 <<< Over 
   100x larger error at 18% full scale.

948:          5923.00 5922.00 5902.00           948: -1  -21
949:          3462.00 3461.00 3494.00           949: -1   32

What this shows is that every 0.01 seconds, where the original file
and the resampled file should have exactly the same value (if the
original data were preserved), sndfile-resample produces large errors.
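
The "100x" and "% full scale" annotations in the table can be checked
with a little arithmetic.  The sketch below uses only rows 354, 940
and 947 as printed above.  The full-scale value of 32768 (16-bit
signed samples) is an assumption; it is not stated in the post but is
consistent with the percentages quoted.  The name summarize is
hypothetical.

```python
FULL_SCALE = 32768  # assumption: 16-bit signed PCM

# match number -> (original, FFT result, sndfile-resample result),
# copied from the table above.
rows = {
    354: (-366, -370, 205),
    940: (3210, 3209, 3111),
    947: (5947, 5946, 5829),
}


def summarize(original, fft, sinc):
    """Return (sinc error / FFT error, original level as % of full scale)."""
    fft_err = abs(fft - original)
    sinc_err = abs(sinc - original)
    return sinc_err / fft_err, 100.0 * abs(original) / FULL_SCALE


for match, values in rows.items():
    ratio, percent = summarize(*values)
    print(f"{match}: sinc error is {ratio:.0f}x the FFT error "
          f"at {percent:.1f}% of full scale")
```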

------------------------

resample-1.7 was even worse, with a phase shift on top of this type of
inaccuracy, coupled with rather serious spectral leakage beyond 2.5
kHz, which was the original band limit (or might as well be assumed to
have been, since it is half the 5000 Hz sample rate).  Upon examining
the waveforms, I could see that resample-1.7 was doing an excellent
job of tracing out the original waveform by drawing pretty much
straight lines between points.  Although visually reassuring, this
actually adds spectral components that were not in the original.  So
it depends on what you want.  This resampling probably won't sound
like the original, but it does look good in an editor.
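
Why straight lines between points add spectral components can be seen
on a synthetic tone.  The sketch below is not resample-1.7's algorithm
and does not use the NASA file; it is a toy case with assumed
parameters (a 1000 Hz tone, an integer upsampling factor of 4, a naive
DFT) chosen so the result is easy to check.  It joins the samples of
the tone with straight lines and measures how much energy lands above
the original 2.5 kHz band limit, compared with the same tone generated
directly at the higher rate.

```python
import cmath
import math

ORIG_RATE = 5000   # Hz, as for the file discussed above
FACTOR = 4         # hypothetical integer upsampling factor, for simplicity
TONE_HZ = 1000     # hypothetical test tone, below the 2.5 kHz band limit
N = 50             # 0.01 s of input: exactly 10 cycles of the tone


def linear_upsample(samples, factor):
    """Join neighbouring samples with straight lines (periodic wrap-around)."""
    n = len(samples)
    out = []
    for i in range(n):
        a, b = samples[i], samples[(i + 1) % n]
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    return out


def energy_fraction_above(signal, rate, cutoff_hz):
    """Fraction of one-sided DFT energy in bins strictly above cutoff_hz."""
    n = len(signal)
    total = above = 0.0
    for k in range(n // 2 + 1):
        x = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(signal))
        energy = abs(x) ** 2
        total += energy
        if k * rate / n > cutoff_hz:
            above += energy
    return above / total


new_rate = ORIG_RATE * FACTOR
tone = [math.sin(2 * math.pi * TONE_HZ * i / ORIG_RATE) for i in range(N)]
straight_lines = linear_upsample(tone, FACTOR)
ideal = [math.sin(2 * math.pi * TONE_HZ * i / new_rate)
         for i in range(N * FACTOR)]

leak_linear = energy_fraction_above(straight_lines, new_rate, ORIG_RATE / 2)
leak_ideal = energy_fraction_above(ideal, new_rate, ORIG_RATE / 2)
print(f"energy above 2.5 kHz, straight lines: {leak_linear:.4%}")
print(f"energy above 2.5 kHz, ideal tone:     {leak_ideal:.4%}")
```

In this toy case the straight-line version puts a small but clearly
nonzero share of its energy above 2.5 kHz (images of the tone around
multiples of 5000 Hz), while the directly generated tone has
essentially none.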

------------------------

Also of interest is that the very latest version of sndfile-resample
gives slightly different results than an earlier version for the 
locations which should match (the versions are for libsamplerate):

Match#       v 0.0.15     v 0.1.2
1:            19.0000     18.0000
2:           437.0000    436.0000
3:          -178.0000   -179.0000
4:          -538.0000   -539.0000
5:         -1068.0000  -1068.0000
6:           -28.0000    -28.0000
7:           -55.0000    -55.0000
8:          -344.0000   -344.0000
9:           805.0000    805.0000
10:          -81.0000    -82.0000
11:          482.0000    482.0000
12:           78.0000     77.0000
13:          227.0000    227.0000
14:          501.0000    500.0000
15:           13.0000     12.0000

<snip>

So the *amount* of corruption of the original data at locations which
should match varies with version!  Fortunately (or perhaps
unfortunately, depending upon your point of view) the latest version
never differs by more than 2 from the earlier version, so the "latest
and greatest" is just as bad.  The sndfile errors in the first table
(whose values match the v 0.0.15 column here) would change by 2 or
less, which is insignificant next to errors in the hundreds.



