Hi,

I've been getting unpredictable results with gcc -funroll-loops. It took me a great deal of time to diagnose why I was getting very poor performance, and when I finally tracked it down, I was at a loss to explain the results. Searching various forums didn't give me any relevant answers.

Here's the sample code that finally helped me zero in on the problem.

#include <stdio.h>
#include <time.h>

#define DATA_COUNT 100000
#define INPUT_DIM 6

void doLoop(double x[DATA_COUNT][INPUT_DIM], double tau[INPUT_DIM],
            int inputDimension, long dataCount);

int main(){
    double x[DATA_COUNT][INPUT_DIM];
    double tau[INPUT_DIM] = {0.2, 2.0, 8.5, 0.2, 0.5, 0.4};
    long t0 = time(NULL);

    printf("Beginning loop...\n");
    doLoop(x, tau, INPUT_DIM, DATA_COUNT);
    /* cast to long so the argument matches the %ld conversion */
    printf("Done. Time elapsed = %ld seconds.\n", (long)(time(NULL) - t0));
}

void doLoop(double x[DATA_COUNT][INPUT_DIM], double tau[INPUT_DIM],
            int inputDimension, long dataCount){
    int i, j, k;
    long t0, t1;
    double diff, rSquared;

    t0 = time(NULL);
    for (i = 0; i < dataCount; i++){
        for (j = 0; j < dataCount; j++){
            rSquared = 0;
            for (k = 0; k < inputDimension; k++){
                diff = (x[i][k] - x[j][k]) / tau[k];
                rSquared += diff * diff;
            }
        }
    }
    t1 = time(NULL);
    printf("Rows processed = %d.\n", i);
    //printf("t1 - t0 = %ld seconds.\n", t1 - t0);
}

I compiled this code and ran it, with the following results.

[dgorur@flanker Desktop]$ gcc -O3 -funroll-loops loopTest.c
[dgorur@flanker Desktop]$ ./a.out
Beginning loop...
Rows processed = 100000.
Done. Time elapsed = 1 seconds.

Now, I uncommented the last printf statement in the doLoop(...) function, and repeated the experiment, with the following results.

[dgorur@flanker Desktop]$ gcc -O3 -funroll-loops loopTest.c
[dgorur@flanker Desktop]$ ./a.out
Beginning loop...
Rows processed = 100000.
t1 - t0 = 49 seconds.
Done. Time elapsed = 49 seconds.

How can there be such a fantastic difference? More than an order of magnitude! And it's sensitive to something that's completely outside the loop. This is crazy!
How can I debug and tune my code if compiler behaviour is going to be so random? Surely this loop unrolling is not black magic!? Does anyone have any idea what's going on?

By the way, removing the division by tau[k] brings the time back to 1 second. That is understandable, because it's actually *in* the innermost loop. Whether or not I time the thing after everything's done should make NO difference. Or maybe I do need a witch doctor and spells to help me speed up code.

Regards,
Dev

--
View this message in context: http://www.nabble.com/Loop-unrolling%3A-black-magic-or-stochastic-process--tp24398027p24398027.html
Sent from the gcc - Help mailing list archive at Nabble.com.
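
One possible reading of these timings (an assumption, not something confirmed in this post) is that rSquared is never read after the loop, so the optimizer may be free to discard some or all of the loop nest, and small changes elsewhere in the function could tip that decision. A minimal sketch of a more robust measurement might accumulate the result, return it, and time the call with clock(). The names sumScaledSqDist and timeSumScaledSqDist are hypothetical and do not appear in the original test.

```c
#include <stddef.h>
#include <time.h>

#define INPUT_DIM 6

/* Sum of scaled squared distances over all ordered row pairs.
   Returning the sum makes the arithmetic observable, so the compiler
   cannot treat the loop nest as dead code when the caller uses it. */
double sumScaledSqDist(double x[][INPUT_DIM], const double tau[INPUT_DIM],
                       size_t dataCount)
{
    double total = 0.0;
    for (size_t i = 0; i < dataCount; i++) {
        for (size_t j = 0; j < dataCount; j++) {
            double rSquared = 0.0;
            for (size_t k = 0; k < INPUT_DIM; k++) {
                double diff = (x[i][k] - x[j][k]) / tau[k];
                rSquared += diff * diff;
            }
            total += rSquared;
        }
    }
    return total;
}

/* Runs one call and returns the CPU time it used, in seconds.
   clock() is used because time() only has one-second resolution.
   The computed sum is stored through `result` so the caller can print it. */
double timeSumScaledSqDist(double x[][INPUT_DIM], const double tau[INPUT_DIM],
                           size_t dataCount, double *result)
{
    clock_t start = clock();
    *result = sumScaledSqDist(x, tau, dataCount);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A driver would fill x with known values first (the array in the test above is uninitialized), call timeSumScaledSqDist, and print both the elapsed time and the sum. Printing the sum is the part that matters: it gives the compiler a reason to keep the computation in every variant of the program being compared.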