On Wednesday 17 December 2008 00:37:48 ext Siarhei Siamashka wrote:
> On Monday 15 December 2008 17:16:58 ext Brad Midgley wrote:
> > I like your idea of using a macro with the original floating point
> > tables, as long as we know it is done at compile time, not runtime :)
>
> What about something like this modification to Jaska's patch? It contains
> floating point constants wrapped into a macro.
>
> This version is using 16-bit multiplications only (additional natural
> change would be just to convert 'sbc_encoder_state->' to int16_t because it
> does not need to be int32_t), which is good for performance for the
> platforms with fast 16-bit integer multiplication. But it is also flexible
> enough to be changed to use 32x32->64 multiplications just by replacing
> FIXED_A and FIXED_T types to int64_t and int32_t respectively (for better
> precision or experiments with conformance testing).
>
> > > Can anybody try to remember/explain what transformations were applied
> > > to the existing fixed point implementation?
> >
> > it was done by several people and the only record we have is in cvs.
> > (part of it is in the old btsco project's cvs)
>
> Regarding the code optimizations. Looking at the tables, It can be seen
> that 'cos_table_fixed_8[0+hop]' is always equal to
> 'cos_table_fixed_8[8+hop]'. The same is true for 'cos_table_fixed_8[1+hop]'
> and 'cos_table_fixed_8[7+hop]' So it is possible to join 't1[0] + t1[8]',
> 't1[1]+ t1[7]' and the other such pairs, effectively halving the number of
> counters. This looks very much like the optimization that was applied to
> the current fixed point code :)
>
> But now it would be very interesting to see if the conformance tests pass
> rate is better with the new filtering function.

Here is one more attempt at improving the filtering function. This time I tried
to get the best possible audio quality (using 32-bit fixed point).
The 16-bit version of the filtering function can be enabled by just commenting
out the '#define SBC_HIGH_PRECISION' line.

The improvements include fixing a problem in the scalefactors processing code.
Here we don't want to use the absolute value, because it is possible to encode
more negative values than positive values with the same number of bits (for a
negative x, neginv(x) is ~x, which is -x - 1 in two's complement):

-	while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) {
+	while ((scalefactor[ch][sb] << SCALE_OUT_BITS) <= neginv(frame->sb_sample_f[blk][ch][sb])) {

Another quality improvement is achieved by keeping more bits in the output of
the filtering function, thus avoiding unnecessary precision loss at the
quantization stage. Both of these changes also naturally improve audio quality
for the 16-bit variant.

We had a talk with Jaska Uimonen here, and now I'm kind of delegated to finish
the work on this filtering function for the SBC encoder (including the final
addition of ARM assembly optimizations). He provided me with his latest variant
of the code, which contains some more optimizations to reduce the number of
operations, and also loop unrolling. I will add his changes to the patch in
the next iteration.

Now the question is how best to integrate the fixed filtering function into
the git repository. If I just continue adding changes to the patch in order to
make it faster, it will not be obvious from the commit log alone how we got to
these code transformations. I intentionally keep posting work-in-progress
variants just to keep track of the history, at least in this mailing list
archive :)

As always, feedback is very much welcome.

Best regards,
Siarhei Siamashka
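To illustrate the scalefactor change described above, here is a minimal
standalone sketch. It is not part of the patch: the helper names neginv_i32()
and scale_factor_for() are made up for this example, and it assumes
SCALE_OUT_BITS is 0 so that the threshold is compared directly against the
sample. It only shows why a negative sample can get away with a smaller scale
factor than the positive sample of the same magnitude.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the patch's neginv() macro:
 * a negative x becomes ~x (which is -x - 1), a non-negative x is unchanged. */
static int32_t neginv_i32(int32_t x)
{
	return x < 0 ? ~x : x;
}

/* Hypothetical helper: start with a threshold of 2 and keep doubling it
 * until it is strictly greater than neginv(sample). The number of doublings
 * is the scale factor. Assumes SCALE_OUT_BITS == 0 for simplicity; the
 * 64-bit threshold avoids overflow for samples near INT32_MAX. */
static int scale_factor_for(int32_t sample)
{
	int64_t threshold = 2;
	int scale_factor = 0;

	while (threshold <= neginv_i32(sample)) {
		scale_factor++;
		threshold *= 2;
	}
	return scale_factor;
}
```

With this sketch, scale_factor_for(-2) is 0 while scale_factor_for(2) is 1:
the value -2 still fits into the range covered by the initial threshold, but
+2 does not. The old fabs()-based loop treated both samples the same way.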
diff --git a/sbc/sbc.c b/sbc/sbc.c
index 5411893..873c370 100644
--- a/sbc/sbc.c
+++ b/sbc/sbc.c
@@ -40,6 +40,7 @@
 #include <string.h>
 #include <stdlib.h>
 #include <sys/types.h>
+#include <limits.h>
 
 #include "sbc_math.h"
 #include "sbc_tables.h"
@@ -742,124 +743,108 @@ static inline void sbc_analyze_four(struct sbc_encoder_state *state,
 
 static inline void _sbc_analyze_eight(const int32_t *in, int32_t *out)
 {
-	sbc_fixed_t t[8], s[8];
-
-	t[0] = SCALE8_STAGE1( /* Q10 */
-		MULA(_sbc_proto_8[0], (in[16] - in[64]), /* Q18 = Q18 * Q0 */
-		MULA(_sbc_proto_8[1], (in[32] - in[48]),
-		MULA(_sbc_proto_8[2], in[4],
-		MULA(_sbc_proto_8[3], in[20],
-		MULA(_sbc_proto_8[4], in[36],
-		MUL( _sbc_proto_8[5], in[52])))))));
-
-	t[1] = SCALE8_STAGE1(
-		MULA(_sbc_proto_8[6], in[2],
-		MULA(_sbc_proto_8[7], in[18],
-		MULA(_sbc_proto_8[8], in[34],
-		MULA(_sbc_proto_8[9], in[50],
-		MUL(_sbc_proto_8[10], in[66]))))));
-
-	t[2] = SCALE8_STAGE1(
-		MULA(_sbc_proto_8[11], in[1],
-		MULA(_sbc_proto_8[12], in[17],
-		MULA(_sbc_proto_8[13], in[33],
-		MULA(_sbc_proto_8[14], in[49],
-		MULA(_sbc_proto_8[15], in[65],
-		MULA(_sbc_proto_8[16], in[3],
-		MULA(_sbc_proto_8[17], in[19],
-		MULA(_sbc_proto_8[18], in[35],
-		MULA(_sbc_proto_8[19], in[51],
-		MUL( _sbc_proto_8[20], in[67])))))))))));
-
-	t[3] = SCALE8_STAGE1(
-		MULA( _sbc_proto_8[21], in[5],
-		MULA( _sbc_proto_8[22], in[21],
-		MULA( _sbc_proto_8[23], in[37],
-		MULA( _sbc_proto_8[24], in[53],
-		MULA( _sbc_proto_8[25], in[69],
-		MULA(-_sbc_proto_8[15], in[15],
-		MULA(-_sbc_proto_8[14], in[31],
-		MULA(-_sbc_proto_8[13], in[47],
-		MULA(-_sbc_proto_8[12], in[63],
-		MUL( -_sbc_proto_8[11], in[79])))))))))));
-
-	t[4] = SCALE8_STAGE1(
-		MULA( _sbc_proto_8[26], in[6],
-		MULA( _sbc_proto_8[27], in[22],
-		MULA( _sbc_proto_8[28], in[38],
-		MULA( _sbc_proto_8[29], in[54],
-		MULA( _sbc_proto_8[30], in[70],
-		MULA(-_sbc_proto_8[10], in[14],
-		MULA(-_sbc_proto_8[9], in[30],
-		MULA(-_sbc_proto_8[8], in[46],
-		MULA(-_sbc_proto_8[7], in[62],
-		MUL( -_sbc_proto_8[6], in[78])))))))))));
-
-	t[5] = SCALE8_STAGE1(
-		MULA( _sbc_proto_8[31], in[7],
-		MULA( _sbc_proto_8[32], in[23],
-		MULA( _sbc_proto_8[33], in[39],
-		MULA( _sbc_proto_8[34], in[55],
-		MULA( _sbc_proto_8[35], in[71],
-		MULA(-_sbc_proto_8[20], in[13],
-		MULA(-_sbc_proto_8[19], in[29],
-		MULA(-_sbc_proto_8[18], in[45],
-		MULA(-_sbc_proto_8[17], in[61],
-		MUL( -_sbc_proto_8[16], in[77])))))))))));
-
-	t[6] = SCALE8_STAGE1(
-		MULA( _sbc_proto_8[36], (in[8] + in[72]),
-		MULA( _sbc_proto_8[37], (in[24] + in[56]),
-		MULA( _sbc_proto_8[38], in[40],
-		MULA(-_sbc_proto_8[39], in[12],
-		MULA(-_sbc_proto_8[5], in[28],
-		MULA(-_sbc_proto_8[4], in[44],
-		MULA(-_sbc_proto_8[3], in[60],
-		MUL( -_sbc_proto_8[2], in[76])))))))));
-
-	t[7] = SCALE8_STAGE1(
-		MULA( _sbc_proto_8[35], in[9],
-		MULA( _sbc_proto_8[34], in[25],
-		MULA( _sbc_proto_8[33], in[41],
-		MULA( _sbc_proto_8[32], in[57],
-		MULA( _sbc_proto_8[31], in[73],
-		MULA(-_sbc_proto_8[25], in[11],
-		MULA(-_sbc_proto_8[24], in[27],
-		MULA(-_sbc_proto_8[23], in[43],
-		MULA(-_sbc_proto_8[22], in[59],
-		MUL( -_sbc_proto_8[21], in[75])))))))))));
-
-	s[0] = MULA( _anamatrix8[0], t[0],
-		MUL( _anamatrix8[1], t[6]));
-	s[1] = MUL( _anamatrix8[7], t[1]);
-	s[2] = MULA( _anamatrix8[2], t[2],
-		MULA( _anamatrix8[3], t[3],
-		MULA( _anamatrix8[4], t[5],
-		MUL( _anamatrix8[5], t[7]))));
-	s[3] = MUL( _anamatrix8[6], t[4]);
-	s[4] = MULA( _anamatrix8[3], t[2],
-		MULA(-_anamatrix8[5], t[3],
-		MULA(-_anamatrix8[2], t[5],
-		MUL( -_anamatrix8[4], t[7]))));
-	s[5] = MULA( _anamatrix8[4], t[2],
-		MULA(-_anamatrix8[2], t[3],
-		MULA( _anamatrix8[5], t[5],
-		MUL( _anamatrix8[3], t[7]))));
-	s[6] = MULA( _anamatrix8[1], t[0],
-		MUL( -_anamatrix8[0], t[6]));
-	s[7] = MULA( _anamatrix8[5], t[2],
-		MULA(-_anamatrix8[4], t[3],
-		MULA( _anamatrix8[3], t[5],
-		MUL( -_anamatrix8[2], t[7]))));
-
-	out[0] = SCALE8_STAGE2( s[0] + s[1] + s[2] + s[3]);
-	out[1] = SCALE8_STAGE2( s[1] - s[3] + s[4] + s[6]);
-	out[2] = SCALE8_STAGE2( s[1] - s[3] + s[5] - s[6]);
-	out[3] = SCALE8_STAGE2(-s[0] + s[1] + s[3] + s[7]);
-	out[4] = SCALE8_STAGE2(-s[0] + s[1] + s[3] - s[7]);
-	out[5] = SCALE8_STAGE2( s[1] - s[3] - s[5] - s[6]);
-	out[6] = SCALE8_STAGE2( s[1] - s[3] - s[4] + s[6]);
-	out[7] = SCALE8_STAGE2( s[0] + s[1] - s[2] + s[3]);
+	FIXED_A t1[16];
+	FIXED_T t2[16];
+	FIXED_A R;
+	int i, hop;
+
+	/* rounding coefficient */
+	R = (FIXED_A)1 << (SBC_PROTO_FIXED8_SCALE-1);
+
+	/* low pass polyphase filter */
+	t1[0] = (FIXED_A)in[0] * _sbc_proto_fixed8[0];
+	t1[1] = (FIXED_A)in[1] * _sbc_proto_fixed8[1];
+	t1[2] = (FIXED_A)in[2] * _sbc_proto_fixed8[2];
+	t1[3] = (FIXED_A)in[3] * _sbc_proto_fixed8[3];
+	t1[4] = (FIXED_A)in[4] * _sbc_proto_fixed8[4];
+	t1[5] = (FIXED_A)in[5] * _sbc_proto_fixed8[5];
+	t1[6] = (FIXED_A)in[6] * _sbc_proto_fixed8[6];
+	t1[7] = (FIXED_A)in[7] * _sbc_proto_fixed8[7];
+	t1[8] = (FIXED_A)in[8] * _sbc_proto_fixed8[8];
+	t1[9] = (FIXED_A)in[9] * _sbc_proto_fixed8[9];
+	t1[10] = (FIXED_A)in[10] * _sbc_proto_fixed8[10];
+	t1[11] = (FIXED_A)in[11] * _sbc_proto_fixed8[11];
+	/* t1[12] = (FIXED_A)in[12] * _sbc_proto_fixed8[12]; */
+	t1[13] = (FIXED_A)in[13] * _sbc_proto_fixed8[13];
+	t1[14] = (FIXED_A)in[14] * _sbc_proto_fixed8[14];
+	t1[15] = (FIXED_A)in[15] * _sbc_proto_fixed8[15];
+
+	hop = 16;
+	for (i = 0; i < 4; i++) {
+		t1[0] += (FIXED_A)in[hop] * _sbc_proto_fixed8[hop];
+		t1[1] += (FIXED_A)in[hop + 1] * _sbc_proto_fixed8[hop + 1];
+		t1[2] += (FIXED_A)in[hop + 2] * _sbc_proto_fixed8[hop + 2];
+		t1[3] += (FIXED_A)in[hop + 3] * _sbc_proto_fixed8[hop + 3];
+		t1[4] += (FIXED_A)in[hop + 4] * _sbc_proto_fixed8[hop + 4];
+		t1[5] += (FIXED_A)in[hop + 5] * _sbc_proto_fixed8[hop + 5];
+		t1[6] += (FIXED_A)in[hop + 6] * _sbc_proto_fixed8[hop + 6];
+		t1[7] += (FIXED_A)in[hop + 7] * _sbc_proto_fixed8[hop + 7];
+		t1[8] += (FIXED_A)in[hop + 8] * _sbc_proto_fixed8[hop + 8];
+		t1[9] += (FIXED_A)in[hop + 9] * _sbc_proto_fixed8[hop + 9];
+		t1[10] += (FIXED_A)in[hop + 10] * _sbc_proto_fixed8[hop + 10];
+		t1[11] += (FIXED_A)in[hop + 11] * _sbc_proto_fixed8[hop + 11];
+		/* t1[12] += (FIXED_A)in[hop + 12] * _sbc_proto_fixed8[hop + 12]; */
+		t1[13] += (FIXED_A)in[hop + 13] * _sbc_proto_fixed8[hop + 13];
+		t1[14] += (FIXED_A)in[hop + 14] * _sbc_proto_fixed8[hop + 14];
+		t1[15] += (FIXED_A)in[hop + 15] * _sbc_proto_fixed8[hop + 15];
+
+		hop += 16;
+	}
+
+	/* scaling */
+	t2[0] = (t1[0] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[1] = (t1[1] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[2] = (t1[2] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[3] = (t1[3] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[4] = (t1[4] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[5] = (t1[5] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[6] = (t1[6] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[7] = (t1[7] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[8] = (t1[8] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[9] = (t1[9] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[10] = (t1[10] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[11] = (t1[11] + R) >> SBC_PROTO_FIXED8_SCALE;
+	/* t2[12] = (t1[12] + R) >> SBC_PROTO_FIXED8_SCALE; */
+	t2[13] = (t1[13] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[14] = (t1[14] + R) >> SBC_PROTO_FIXED8_SCALE;
+	t2[15] = (t1[15] + R) >> SBC_PROTO_FIXED8_SCALE;
+
+	R = (FIXED_A)1 << (SBC_COS_TABLE_FIXED8_SCALE-1-SCALE_OUT_BITS);
+
+	/* do the cos transform */
+	hop = 0;
+	for (i = 0; i < 8; i++) {
+		t1[i] = (FIXED_A)t2[0] * cos_table_fixed_8[0 + hop];
+		t1[i] += (FIXED_A)t2[1] * cos_table_fixed_8[1 + hop];
+		t1[i] += (FIXED_A)t2[2] * cos_table_fixed_8[2 + hop];
+		t1[i] += (FIXED_A)t2[3] * cos_table_fixed_8[3 + hop];
+		/* cos_table_fixed_8[4 + hop] = 1.0 */
+		t1[i] += (FIXED_A)t2[4] << (sizeof(FIXED_T)*CHAR_BIT-1);
+		t1[i] += (FIXED_A)t2[5] * cos_table_fixed_8[5 + hop];
+		t1[i] += (FIXED_A)t2[6] * cos_table_fixed_8[6 + hop];
+		t1[i] += (FIXED_A)t2[7] * cos_table_fixed_8[7 + hop];
+		t1[i] += (FIXED_A)t2[8] * cos_table_fixed_8[8 + hop];
+		t1[i] += (FIXED_A)t2[9] * cos_table_fixed_8[9 + hop];
+		t1[i] += (FIXED_A)t2[10] * cos_table_fixed_8[10 + hop];
+		t1[i] += (FIXED_A)t2[11] * cos_table_fixed_8[11 + hop];
+		/* cos_table_fixed_8[12 + hop] = 0.0 */
+		/* t1[i] += (FIXED_A)t2[12] * cos_table_fixed_8[12 + hop]; */
+		t1[i] += (FIXED_A)t2[13] * cos_table_fixed_8[13 + hop];
+		t1[i] += (FIXED_A)t2[14] * cos_table_fixed_8[14 + hop];
+		t1[i] += (FIXED_A)t2[15] * cos_table_fixed_8[15 + hop];
+
+		hop += 16;
+	}
+
+	/* scaling */
+	out[0] = (t1[0] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[1] = (t1[1] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[2] = (t1[2] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[3] = (t1[3] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[4] = (t1[4] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[5] = (t1[5] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[6] = (t1[6] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
+	out[7] = (t1[7] + R) >> (SBC_COS_TABLE_FIXED8_SCALE-SCALE_OUT_BITS);
 }
 
 static inline void sbc_analyze_eight(struct sbc_encoder_state *state,
@@ -1006,7 +991,7 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len)
 			frame->scale_factor[ch][sb] = 0;
 			scalefactor[ch][sb] = 2;
 			for (blk = 0; blk < frame->blocks; blk++) {
-				while (scalefactor[ch][sb] < fabs(frame->sb_sample_f[blk][ch][sb])) {
+				while ((scalefactor[ch][sb] << SCALE_OUT_BITS) <= neginv(frame->sb_sample_f[blk][ch][sb])) {
 					frame->scale_factor[ch][sb]++;
 					scalefactor[ch][sb] *= 2;
 				}
@@ -1040,11 +1025,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len)
 					frame->sb_sample_f[blk][1][sb]) >> 1;
 
 				/* calculate scale_factor_j and scalefactor_j for joint case */
-				while (scalefactor_j[0] < fabs(sb_sample_j[blk][0])) {
+				while ((scalefactor_j[0] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][0])) {
 					scale_factor_j[0]++;
 					scalefactor_j[0] *= 2;
 				}
-				while (scalefactor_j[1] < fabs(sb_sample_j[blk][1])) {
+				while ((scalefactor_j[1] << SCALE_OUT_BITS) <= neginv(sb_sample_j[blk][1])) {
 					scale_factor_j[1]++;
 					scalefactor_j[1] *= 2;
 				}
@@ -1100,11 +1085,11 @@ static int sbc_pack_frame(uint8_t *data, struct sbc_frame *frame, size_t len)
 		for (ch = 0; ch < frame->channels; ch++) {
 			for (sb = 0; sb < frame->subbands; sb++) {
 				if (levels[ch][sb] > 0) {
-					audio_sample =
-						(uint16_t) (((((int64_t)frame->sb_sample_f[blk][ch][sb]*levels[ch][sb]) >>
-									(frame->scale_factor[ch][sb] + 1)) +
-								levels[ch][sb]) >> 1);
-					PUT_BITS(audio_sample & levels[ch][sb], bits[ch][sb]);
+					int32_t sample = frame->sb_sample_f[blk][ch][sb];
+					int32_t s_shift = (frame->scale_factor[ch][sb] + 1 + SCALE_OUT_BITS);
+					int32_t ls = levels[ch][sb];
+					audio_sample = ((((int64_t)1 << s_shift) + sample) * ls) >> (s_shift + 1);
+					PUT_BITS(audio_sample, bits[ch][sb]);
 				}
 			}
 		}
diff --git a/sbc/sbc_math.h b/sbc/sbc_math.h
index b3d87a6..b53e3d1 100644
--- a/sbc/sbc_math.h
+++ b/sbc/sbc_math.h
@@ -23,12 +23,14 @@
  *
  */
 
-#define fabs(x) ((x) < 0 ? -(x) : (x))
+#define neginv(x) ((x) < 0 ? ~(x) : (x))
 /* C does not provide an explicit arithmetic shift right but this will
    always be correct and every compiler *should* generate optimal code */
 #define ASR(val, bits) ((-2 >> 1 == -1) ? \
 		 ((int32_t)(val)) >> (bits) : ((int32_t) (val)) / (1 << (bits)))
 
+#define SCALE_OUT_BITS 14
+
 #define SCALE_PROTO4_TBL	15
 #define SCALE_ANA4_TBL		17
 #define SCALE_PROTO8_TBL	16
@@ -38,7 +40,7 @@
 #define SCALE_NPROTO4_TBL	11
 #define SCALE_NPROTO8_TBL	11
 #define SCALE4_STAGE1_BITS	15
-#define SCALE4_STAGE2_BITS	16
+#define SCALE4_STAGE2_BITS	(16-SCALE_OUT_BITS)
 #define SCALE4_STAGED1_BITS	15
 #define SCALE4_STAGED2_BITS	16
 #define SCALE8_STAGE1_BITS	15
diff --git a/sbc/sbc_tables.h b/sbc/sbc_tables.h
index f5daaa7..eeea7b7 100644
--- a/sbc/sbc_tables.h
+++ b/sbc/sbc_tables.h
@@ -166,3 +166,88 @@ static const int32_t synmatrix8[16][8] = {
 	{ SN8(0xf9592678), SN8(0x018f8b84), SN8(0x07d8a5f0), SN8(0x0471ced0),
 	  SN8(0xfb8e3130), SN8(0xf8275a10), SN8(0xfe70747c), SN8(0x06a6d988) }
 };
+
+#define SBC_HIGH_PRECISION
+
+#ifdef SBC_HIGH_PRECISION
+# define FIXED_A int64_t /* data type for fixed point accumulator */
+# define FIXED_T int32_t /* data type for fixed point constants */
+# define SBC_FIXED8_EXTRA_BITS 15
+#else
+# define FIXED_A int32_t /* data type for fixed point accumulator */
+# define FIXED_T int16_t /* data type for fixed point constants */
+# define SBC_FIXED8_EXTRA_BITS 0
+#endif
+
+/* A2DP specification: Section 12.8 Tables */
+#define SBC_PROTO_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1-SBC_FIXED8_EXTRA_BITS)
+#define F(x) (FIXED_T)(FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5)
+static const FIXED_T _sbc_proto_fixed8[80] = {
+	 F(0.00000000E+00), F(1.56575398E-04), F(3.43256425E-04), F(5.54620202E-04),
+	 F(8.23919506E-04), F(1.13992507E-03), F(1.47640169E-03), F(1.78371725E-03),
+	 F(2.01182542E-03), F(2.10371989E-03), F(1.99454554E-03), F(1.61656283E-03),
+	 F(9.02154502E-04),-F(1.78805361E-04),-F(1.64973098E-03),-F(3.49717454E-03),
+	 F(5.65949473E-03), F(8.02941163E-03), F(1.04584443E-02), F(1.27472335E-02),
+	 F(1.46525263E-02), F(1.59045603E-02), F(1.62208471E-02), F(1.53184106E-02),
+	 F(1.29371806E-02), F(8.85757540E-03), F(2.92408442E-03),-F(4.91578024E-03),
+	-F(1.46404076E-02),-F(2.61098752E-02),-F(3.90751381E-02),-F(5.31873032E-02),
+	 F(6.79989431E-02), F(8.29847578E-02), F(9.75753918E-02), F(1.11196689E-01),
+	 F(1.23264548E-01), F(1.33264415E-01), F(1.40753505E-01), F(1.45389847E-01),
+	 F(1.46955068E-01), F(1.45389847E-01), F(1.40753505E-01), F(1.33264415E-01),
+	 F(1.23264548E-01), F(1.11196689E-01), F(9.75753918E-02), F(8.29847578E-02),
+	-F(6.79989431E-02),-F(5.31873032E-02),-F(3.90751381E-02),-F(2.61098752E-02),
+	-F(1.46404076E-02),-F(4.91578024E-03), F(2.92408442E-03), F(8.85757540E-03),
+	 F(1.29371806E-02), F(1.53184106E-02), F(1.62208471E-02), F(1.59045603E-02),
+	 F(1.46525263E-02), F(1.27472335E-02), F(1.04584443E-02), F(8.02941163E-03),
+	-F(5.65949473E-03),-F(3.49717454E-03),-F(1.64973098E-03),-F(1.78805361E-04),
+	 F(9.02154502E-04), F(1.61656283E-03), F(1.99454554E-03), F(2.10371989E-03),
+	 F(2.01182542E-03), F(1.78371725E-03), F(1.47640169E-03), F(1.13992507E-03),
+	 F(8.23919506E-04), F(5.54620202E-04), F(3.43256425E-04), F(1.56575398E-04),
+};
+#undef F
+
+/*
+ * To produce this cosine matrix in Octave:
+ *
+ * b = zeros(8, 16);
+ * for i = 0:7 for j = 0:15 b(i+1, j+1) = cos( (i + 0.5) * (j - 4) * (pi/8) ) endfor endfor;
+ * printf("%.10f, ", b');
+ *
+ */
+#define SBC_COS_TABLE_FIXED8_SCALE (sizeof(FIXED_T)*CHAR_BIT-1+SBC_FIXED8_EXTRA_BITS)
+#define F(x) (FIXED_T)(FIXED_A)((x)*((FIXED_A)1<<(sizeof(FIXED_T)*CHAR_BIT-1))+0.5)
+static const FIXED_T cos_table_fixed_8[128] = {
+	 F(0.7071067812), F(0.8314696123), F(0.9238795325), F(0.9807852804),
+	 F(1.0000000000), F(0.9807852804), F(0.9238795325), F(0.8314696123),
+	 F(0.7071067812), F(0.5555702330), F(0.3826834324), F(0.1950903220),
+	 F(0.0000000000),-F(0.1950903220),-F(0.3826834324),-F(0.5555702330),
+	-F(0.7071067812),-F(0.1950903220), F(0.3826834324), F(0.8314696123),
+	 F(1.0000000000), F(0.8314696123), F(0.3826834324),-F(0.1950903220),
+	-F(0.7071067812),-F(0.9807852804),-F(0.9238795325),-F(0.5555702330),
+	-F(0.0000000000), F(0.5555702330), F(0.9238795325), F(0.9807852804),
+	-F(0.7071067812),-F(0.9807852804),-F(0.3826834324), F(0.5555702330),
+	 F(1.0000000000), F(0.5555702330),-F(0.3826834324),-F(0.9807852804),
+	-F(0.7071067812), F(0.1950903220), F(0.9238795325), F(0.8314696123),
+	 F(0.0000000000),-F(0.8314696123),-F(0.9238795325),-F(0.1950903220),
+	 F(0.7071067812),-F(0.5555702330),-F(0.9238795325), F(0.1950903220),
+	 F(1.0000000000), F(0.1950903220),-F(0.9238795325),-F(0.5555702330),
+	 F(0.7071067812), F(0.8314696123),-F(0.3826834324),-F(0.9807852804),
+	-F(0.0000000000), F(0.9807852804), F(0.3826834324),-F(0.8314696123),
+	 F(0.7071067812), F(0.5555702330),-F(0.9238795325),-F(0.1950903220),
+	 F(1.0000000000),-F(0.1950903220),-F(0.9238795325), F(0.5555702330),
+	 F(0.7071067812),-F(0.8314696123),-F(0.3826834324), F(0.9807852804),
+	 F(0.0000000000),-F(0.9807852804), F(0.3826834324), F(0.8314696123),
+	-F(0.7071067812), F(0.9807852804),-F(0.3826834324),-F(0.5555702330),
+	 F(1.0000000000),-F(0.5555702330),-F(0.3826834324), F(0.9807852804),
+	-F(0.7071067812),-F(0.1950903220), F(0.9238795325),-F(0.8314696123),
+	-F(0.0000000000), F(0.8314696123),-F(0.9238795325), F(0.1950903220),
+	-F(0.7071067812), F(0.1950903220), F(0.3826834324),-F(0.8314696123),
+	 F(1.0000000000),-F(0.8314696123), F(0.3826834324), F(0.1950903220),
+	-F(0.7071067812), F(0.9807852804),-F(0.9238795325), F(0.5555702330),
+	-F(0.0000000000),-F(0.5555702330), F(0.9238795325),-F(0.9807852804),
+	 F(0.7071067812),-F(0.8314696123), F(0.9238795325),-F(0.9807852804),
+	 F(1.0000000000),-F(0.9807852804), F(0.9238795325),-F(0.8314696123),
+	 F(0.7071067812),-F(0.5555702330), F(0.3826834324),-F(0.1950903220),
+	-F(0.0000000000), F(0.1950903220),-F(0.3826834324), F(0.5555702330),
+};
+#undef F
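
For readers who want to experiment with the table macros outside of the patch,
the following is a rough standalone sketch of the same idea in the 16-bit
configuration: a floating point constant is turned into a fixed point
coefficient at compile time, and a product is scaled back with rounding, as
the "rounding coefficient" R does in the filter. The names fixed_acc_t,
fixed_coef_t, COEF_FRAC_BITS, TO_FIXED() and mul_round() are invented for this
example and do not appear in the patch.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for FIXED_A and FIXED_T in the 16-bit variant. */
typedef int32_t fixed_acc_t;   /* accumulator type */
typedef int16_t fixed_coef_t;  /* coefficient type */

/* Number of fractional bits in a coefficient: 15 for int16_t. */
#define COEF_FRAC_BITS (sizeof(fixed_coef_t) * CHAR_BIT - 1)

/* Same shape as the patch's F(x) macro: scale by 2^15, add 0.5 and truncate.
 * The +0.5 rounds correctly only for non-negative x, which is presumably why
 * the tables write negative entries as -F(x). A value of 1.0 would not fit
 * into the coefficient type, which matches the patch replacing the
 * multiplication by cos_table_fixed_8[4 + hop] with a shift. */
#define TO_FIXED(x) \
	((fixed_coef_t)(fixed_acc_t)((x) * ((fixed_acc_t)1 << COEF_FRAC_BITS) + 0.5))

static const fixed_coef_t half = TO_FIXED(0.5);  /* 0.5 * 32768 = 16384 */

/* Multiply a 16-bit sample by a coefficient and scale the product back,
 * adding half of the divisor first so that the result is rounded. */
static int32_t mul_round(int16_t sample, fixed_coef_t coef)
{
	fixed_acc_t acc = (fixed_acc_t)sample * coef;

	acc += (fixed_acc_t)1 << (COEF_FRAC_BITS - 1);  /* rounding term */
	return (int32_t)(acc >> COEF_FRAC_BITS);
}
```

Because TO_FIXED() only combines constants and casts, the conversion can be
done by the compiler in a static initializer, so no floating point code is
needed at runtime. Switching the two typedefs to int64_t and int32_t would
give the analogue of the SBC_HIGH_PRECISION configuration.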