Hello,
I'm implementing a program do solve inverse problem by iteration but
with a huge matrix (5GB).
I'm using g++ 4.2.3 on Ubuntu (Hardy) with an Intel E8400 processor and
compared results with icc 10.1.018.
I have noticed two weird things
1° processing time:
with g++ iteration time slowly increases and with icc it doesn't.
Though I use exactly same program, same data and same computer. I did
many tests and this is consistent through all tests.
For instance it starts at 22s per iteration for g++ and icc then g++
code slows down to 24s after 10 iterations and then every two or three
iteration gains one second.
Note that this code is dominated by memory access since I do huge matrix
multiplication (float).
2° possible rounding problem :
at the end of each iteration I update the result matrix with the
following instruction
for( int i = 0; i < n; ++i )
res[i] += err[i] / k[i];
but with this code I loose convergence after a few iterations.
When changine this code to this
for( int i = 0; i < n; ++i )
{
err[i] /= k[i];
res[i] += err[i];
}
or this
for( int i = 0; i < n; ++i )
err[i] /= k[i];
for( int i = 0; i < n; ++i )
res[i] += err[i];
The program converges nicely without problem.
I saw the same problem with this type of result update code too
for( int i = 0; i < n; ++i )
res[i] *= err[i] / k[i];
I checked with -O2 and -O3 and same effect.
I tested on another processor (Xeon quadcore E5345) with g++ and icc,
same behavior.
I want to understand what is going on here because it is a vicious
"feature". I spent weeks trying to figure out why the program wouldn't
converge until I split the instruction to check intermediate results.
Could this be due to a rounding "feature" ?
BTW I would like to report that code compiled with g++ 4.2.3 is 25%
faster than code compiled with g++ 3.4.6 and that there is now no
significant difference with icc (regarding my code). Congratulation.