Re: "may or may not", that is the question

John Fine <johnsfine@xxxxxxxxxxx> · Tue, 02 Sep 2008 17:19:41 -0400

David Bruant wrote:
> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.
>
> And, maybe, that we (I) could write to the people that write the manual
> to add what will be said here to improve the manual, because I find the
> "may or may not" quite weak for a manual.

Try a few experiments before jumping to your conclusions.  I have, and 
in those experiments unrolling loops usually made those loops slower.
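
A minimal sketch of such an experiment, assuming a made-up workload 
(summing an int array).  The names sum_array and time_sum are 
hypothetical and not part of gcc.  The idea is to build the same file 
twice, for example once with "gcc -O2" and once with 
"gcc -O2 -funroll-loops", and compare the times on your own machine.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: a simple loop the compiler may or may not unroll. */
long sum_array(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Call sum_array `reps` times and return the CPU time used, in seconds.
 * The accumulated sum is stored in *out so the work is not discarded. */
double time_sum(int *a, size_t n, int reps, long *out)
{
    assert(a != NULL && n > 0 && out != NULL);

    long total = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        a[0] = r;   /* change the input each pass so the sum is less
                       likely to be computed once and reused */
        total += sum_array(a, n);
    }
    clock_t end = clock();

    *out = total;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call time_sum from main with a large array and enough repetitions to 
get a measurable time, and print the result.  Expect the numbers to 
vary with the CPU, the gcc version and the array size, so repeat each 
run several times before drawing any conclusion.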

It is a complicated situation and you may find that the option to unroll 
loops makes the total program faster despite making most loops slower (a 
few inner loops that took a lot of time might get faster while loops 
that took less time get slower).  But even that much is far from 
certain.  The total program might get slower.

Depending on details of compiler behavior that I don't know for gcc, 
there might be much stronger reasons than the following for loops to get 
slower when unrolled, but the following is sometimes enough:

1) Modern CPUs overlap a lot of work, so all the counting and jumping 
involved in a loop might happen to be fully overlapped and free, so 
there is nothing to be saved by unrolling.

2) By unrolling, you are always giving the L1 instruction cache more 
work to do.  Depending on complex issues of the instruction mix and 
decode overheads etc. the cost of fetching all those extra instructions 
might outweigh everything else.  So after saving nothing because of 
factor (1) you then pay a lot for it by factor (2).
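
To make the two factors concrete, here is a sketch of the same loop 
written rolled and unrolled by four by hand.  This only illustrates the 
trade-off; it is an assumption about the general shape of unrolled 
code, not what gcc actually emits, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled: one add per iteration, plus the count, compare and jump
 * that factor (1) says may already be overlapped and free. */
long sum_rolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Unrolled by 4: a quarter of the counting and jumping, but a larger
 * loop body plus a cleanup loop, so more instructions to fetch and
 * decode, which is the cost described in factor (2). */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++)      /* handle the 0 to 3 leftover elements */
        total += a[i];
    return total;
}
```

Both functions return the same sum.  Which one runs faster is exactly 
the "may or may not" question, and only measuring will tell.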