Hi,
Consider the following small piece of code (prepared specially for this question):
-------------CODE---------------
#ifdef __cplusplus
extern "C"
#endif
int printf(const char *format, ...);

/*
Lesson: store stuff either as variables, or in arrays - not both!
*/

float a[4] = {0.123, -0.231, 0.652, 1};
float b[4] = {-0.523, -9.6421, 0.0123, 1};

float a0 = 0.123, a1 = -0.231, a2 = 0.652, a3 = 1;
float b0 = -0.523, b1 = -9.6421, b2 = 0.0123, b3 = 1;

const int loopcount = 1000000;

void thefunc1(void)
{
    int i;
    for(i = 0; i < loopcount; ++i)
    {
        a[0] = (b[0] * 0.9999f) + (b[0] * 0.00001f);
        a[1] = (b[1] * 0.9999f) + (b[1] * 0.00001f);
        a[2] = (b[2] * 0.9999f) + (b[2] * 0.00001f);

        b[0] = a[0];
        b[1] = a[1];
        b[2] = a[2];
    }
}

void thefunc2(void)
{
    int i;
    for(i = 0; i < loopcount; ++i)
    {
        a0 = (b0 * 0.9999f) + (b0 * 0.00001f);
        a1 = (b1 * 0.9999f) + (b1 * 0.00001f);
        a2 = (b2 * 0.9999f) + (b2 * 0.00001f);

        b0 = a0;
        b1 = a1;
        b2 = a2;
    }
}

int main()
{
    thefunc1();
    thefunc2();

    printf("t1: [%.24e, %.24e, %.24e, %.24e]\n", a[0], a[1], a[2], a[3]);
    printf("t2: [%.24e, %.24e, %.24e, %.24e]\n", a0, a1, a2, a3);

    return 0;
}
-------------CODE---------------
Now consider the output using gcc 4 (from CVS) and gcc 3.4.3 (compilation flags: -ffast-math -march=pentium3 -msse -mfpmath=387 -O3 -fno-unroll-loops):
./gcc_error-3.4.3
t1: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 1.000807363220784352053728e-41, 1.000000000000000000000000e+00]
./gcc_error-4
t1: [-4.255302206601370422491529e-40, -7.845133972967615879964511e-39, 1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 1.000807363220784352053728e-41, 1.000000000000000000000000e+00]
Note how on gcc 3.4.3, using variables or arrays of floats gives the same results. However, on gcc 4, it seems this is no longer the case (using the above flags, anyhow.)
It's not a bug, I assume, but could someone explain to me why this is happening?
Then observe the following with SSE fpmath (flags: -ffast-math -march=pentium3 -msse -mfpmath=sse -O3 -fno-unroll-loops):
./gcc_error-3.4.3
t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
./gcc_error-4
t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
Using SSE seems to give the same answers whether I use variables or arrays. Wha?!?!?! Now I'm even more confused.
Could someone explain the above to me? Why is there a difference using arrays or variables in 387 maths, and not in SSE maths?
Also, is it better (i.e. more efficient, faster, etc.) to use variables or arrays to hold the data in matrix/vector classes in C++?
Thanks, Asfand Yar