On Sat, Aug 10, 2013 at 06:06:26PM -0600, Anthony Foiani wrote:
>
> Greetings.
>
> Chatting on IRC today (freenode #gcc), someone brought up the
> following example code:
>
>     #include <stdio.h>
>
>     template <typename T>
>     inline T const& max (T const& a, T const& b)
>     {
>         // if a < b then use b else use a
>         return a<b?b:a;
>     }
>
>     int main()
>     {
>         long long unsigned sum = 0;
>         for(int x = 1; x <= 100000000; x++)
>         {
>             sum+=max(x,x+1);
>         }
>         printf("%llu\n", sum);
>     }
>
> They noticed that their earlier compiler (4.6.3 -O3) successfully
> reduced the loop, while 4.8.1 didn't.
>

For general enlightenment, could you quickly explain what "reduce the
loop" means? One can of course just get rid of the loop by precomputing
sum, but I guess that's not what you mean? (I don't understand the
assembler code.)

With gcc-4.7.1 I get the following timings for this program (with a
bigger loop):

    #include <iostream>
    #include <cassert>

    template <typename T>
    inline T const& max (T const& a, T const& b)
    {
      return a<b?b:a;
    }

    int main()
    {
      typedef long long unsigned UInt;
      typedef long long Index;
      UInt sum = 0;
      constexpr UInt n = 100000000000;
      for (Index x = 0; x < Index(n); ++x) sum+=max(x,x+1);
      assert(sum == (n * (n+1))/2);
      std::cout << sum << "\n"; // prints "932356074711512064" for 64-bit long long
    }

> g++ --std=c++11 -Ofast -funroll-loops -Wall Example.cpp
> time ./a.out
932356074711512064

real    0m56.278s
user    0m56.126s
sys     0m0.001s

Interestingly, when changing Index to UInt I get

real    1m7.261s
user    1m7.076s
sys     0m0.018s

which is, at least for me, an unexpected difference. (I would have
assumed that, if there is a difference, then Index=UInt should be
faster.)

Oliver