Re: Optimiziation questions (c++)

Tony Wetmore <tony.wetmore@xxxxxxxxxxxx> · Thu, 07 Feb 2008 11:00:46 -0500

NightStrike wrote:
>> (2)
>> Is there a big difference in speed between using a standard for-loop
>> for vector traversal and using iterators?  I already verified that I
>> can get a significant speed improvement (with very large N) by
>> accessing the vector directory with v[i] instead of using the at()
>> method.  But what about iterators?  If I write it with iterators
>> instead, will that be optimized out at -O3 to reduce to essentially
>> the same thing?
>
> I received feedback on #1.  Any ideas on #2 from this email?

Since you appear to already have a test case to measure and compare 
performance, why not simply do the investigation yourself?

I am not a GCC internals expert, but the at() method *should* be 
expected to perform much slower than direct indexing.  I believe the 
at() method is required to bounds-check every access to the vector, and 
throw an exception, while direct indexing is not.

As for iterator-loop optimization vs. for-loop optimization, that likely 
depends on whether the vector::size() method is inlined and the call 
optimized away by GCC.  In the iterator-loop case, it will depend on 
whether the vector::end() method is similarly optimized away, since that 
will need to be called for every loop iteration, just as vector::size() 
would be called in the for-loop.

I think an easy way to determine the precise optimization would be to 
look at the assembly code generated by GCC.  I would expect a very 
simple test case to result in a very manageable amount of assembly code 
to analyze.

I wrote a simple program to do just that and the optimized version using 
iterators would appear to be slightly more efficient to my untrained 
eyes, but that might depend on the exact cost of individual assembly 
instructions on your architecture.  In the -O3 case, each function 
contains approximately 20 lines of assembly code.

I cannot say what sort of improvement, if any, that might provide over a 
large data set, since I did not measure performance.

Here is the program I wrote:

#include <cstdio>
#include <vector>
void printForLoop( std::vector<int>& intVec )
{
  for( size_t i = 0; i < intVec.size(); ++i )
    printf( "%d\n", intVec[i] );
}
void printIterator( std::vector<int>& intVec )
{
  for( std::vector<int>::iterator it = intVec.begin(); it != 
intVec.end(); ++it )
    printf( "%d\n", *it );
}
int main()
{
  std::vector<int> intVec;
  for( size_t i = 0; i < 10; ++i )
    intVec.push_back( i );
  printForLoop( intVec );
  printIterator( intVec );
}

I examined the assembly code by disassembling the two functions in GDB. 
 When I compiled with -O3, neither function made a call to check the 
loop boundary conditions.  That is, both vector.size() and vector.end() 
were optimized away by GCC.

You could instead have GCC save the assembly code to a file, but loading 
the code into GDB seemed just as easy.  Either way, I highly recommend 
examining the generated assembly code yourself, as it can be quite 
instructive.

--
Tony Wetmore