Christophe Meessen wrote: > I'm implementing a program do solve inverse problem by iteration but > with a huge matrix (5GB). > > I'm using g++ 4.2.3 on Ubuntu (Hardy) with an Intel E8400 processor and > compared results with icc 10.1.018. > > I have noticed two weird things > > 1° processing time: > with g++ iteration time slowly increases and with icc it doesn't. > Though I use exactly same program, same data and same computer. I did > many tests and this is consistent through all tests. > > For instance it starts at 22s per iteration for g++ and icc then g++ > code slows down to 24s after 10 iterations and then every two or three > iteration gains one second. > > Note that this code is dominated by memory access since I do huge matrix > multiplication (float). > > > 2° possible rounding problem : > > at the end of each iteration I update the result matrix with the > following instruction > > for( int i = 0; i < n; ++i ) > res[i] += err[i] / k[i]; > > but with this code I loose convergence after a few iterations. > When changine this code to this > > for( int i = 0; i < n; ++i ) > { > err[i] /= k[i]; > res[i] += err[i]; > } > > or this > > for( int i = 0; i < n; ++i ) > err[i] /= k[i]; > > for( int i = 0; i < n; ++i ) > res[i] += err[i]; > > The program converges nicely without problem. > > I saw the same problem with this type of result update code too > > for( int i = 0; i < n; ++i ) > res[i] *= err[i] / k[i]; > > I checked with -O2 and -O3 and same effect. > > I tested on another processor (Xeon quadcore E5345) with g++ and icc, > same behavior. > > I want to understand what is going on here because it is a vicious > "feature". I spent weeks trying to figure out why the program wouldn't > converge until I split the instruction to check intermediate results. > > Could this be due to a rounding "feature" ? It's possible that res[i] += err[i] / k[i]; might give different results on x86 from err[i] /= k[i]; res[i] += err[i]; due to intermediate 80-bit precision being used for the result. However, if you have a 5 Gb matrix I'm assuming you are on a 64-bit system, and 64-bit platforms don't use the 80-bit intermediate precision. What does "uname -a" say? Andrew.