Re: floating point precision on gcc-4 differs using variables or arrays

Brian Budge <brian.budge@xxxxxxxxx> · Fri, 25 Mar 2005 12:34:08 +0100

I suspect that SSE is being used for the arrays because they
happen to be appropriately sized.  If I remember correctly, gcc 4 was
to introduce some autovectorization.  Perhaps that's what's going on.

  Brian

On Fri, 25 Mar 2005 10:39:07 +0000, Asfand Yar Qazi
<email@xxxxxxxxxxxxxxxxx> wrote:
> Hi,
> 
> Consider the following little bunch of code (prepared specially for
> this question):
> 
> -------------CODE---------------
> #ifdef __cplusplus
> extern "C"
> #endif
> int printf(const char *format, ...);
> 
> /*
> 
> Lesson: store stuff either as variables, or in arrays - not both!
> 
> */
> 
> float a[4] = {0.123, -0.231, 0.652, 1};
> float b[4] = {-0.523, -9.6421, 0.0123, 1};
> 
> float a0 = 0.123,  a1 = -0.231, a2 = 0.652, a3 = 1;
> float b0 = -0.523, b1 = -9.6421, b2 = 0.0123, b3 = 1;
> 
> const int loopcount = 1000000;
> 
> void
> thefunc1(void)
> {
>         int i;
>         for(i = 0; i < loopcount; ++i) {
>                 a[0] = (b[0] * 0.9999f) + (b[0] * 0.00001f);
>                 a[1] = (b[1] * 0.9999f) + (b[1] * 0.00001f);
>                 a[2] = (b[2] * 0.9999f) + (b[2] * 0.00001f);
> 
>                 b[0] = a[0];
>                 b[1] = a[1];
>                 b[2] = a[2];
>         }
> }
> 
> void
> thefunc2(void)
> {
>         int i;
>         for(i = 0; i < loopcount; ++i) {
>                 a0 = (b0 * 0.9999f) + (b0 * 0.00001f);
>                 a1 = (b1 * 0.9999f) + (b1 * 0.00001f);
>                 a2 = (b2 * 0.9999f) + (b2 * 0.00001f);
> 
>                 b0 = a0;
>                 b1 = a1;
>                 b2 = a2;
>         }
> }
> 
> int
> main()
> {
>         thefunc1();
>         thefunc2();
> 
>         printf("t1: [%.24e, %.24e, %.24e, %.24e]\n", a[0], a[1], a[2], a[3]);
>         printf("t2: [%.24e, %.24e, %.24e, %.24e]\n", a0, a1, a2, a3);
> 
>         return 0;
> }
> -------------CODE---------------
> 
> Now, consider the output, using gcc 4 cvs and gcc 3.4.3 (compilation
> flags:  -ffast-math -march=pentium3 -msse -mfpmath=387 -O3
> -fno-unroll-loops )
> 
> ./gcc_error-3.4.3
> t1: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39,
> 1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
> t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39,
> 1.000807363220784352053728e-41, 1.000000000000000000000000e+00]
> 
> ./gcc_error-4
> t1: [-4.255302206601370422491529e-40, -7.845133972967615879964511e-39,
> 1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
> t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39,
> 1.000807363220784352053728e-41, 1.000000000000000000000000e+00]
> 
> Note how on gcc 3.4.3, using variables or arrays of floats gives the
> same results.  However, on gcc 4, it seems this is no longer the case
> (using the above flags, anyhow.)
> 
> It's not a bug, I assume, but could someone explain to me why this
> is happening?
> 
> Then, observe the following with sse fpmath (flags: -ffast-math
> -march=pentium3 -msse -mfpmath=sse -O3 -fno-unroll-loops)
> 
> ./gcc_error-3.4.3
> t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39,
> 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
> t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39,
> 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
> 
> ./gcc_error-4
> t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39,
> 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
> t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39,
> 9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
> 
> Using SSE seems to give the same answers using variables or arrays.
> Wha?!?!?!  Now I'm even more confused.
> 
> Could someone explain the above to me?  Why is there a difference
> using arrays or variables in 387 maths, and not in SSE maths?
> 
> Also, is it better (i.e. more efficient, faster, etc.) to use
> variables or arrays to hold the data in matrix/vector classes in C++?
> 
> Thanks,
>         Asfand Yar
>