Hmm, I wasn't saying that there could be a sampling error. I am saying
that we may have picked a path to benchmark that isn't optimized by
-flto. My critical path takes tens of microseconds and ran about 400K
times, and I see about a 10% performance hit.

----- Original Message -----
From: Uri Moszkowicz <uri@xxxxxxxxx>
To: Hei Chan <structurechart@xxxxxxxxx>
Cc: gcc-help <gcc-help@xxxxxxxxxxx>
Sent: Wednesday, June 12, 2013 6:09 AM
Subject: Re: -flto making program slower?

Let me clarify then. The benchmark consisted of a few dozen runs of the
program, each with different inputs, with the average time given. For
this program, each run typically takes a few hours, so there's little
risk of sampling error.

On Tue, Jun 11, 2013 at 6:01 PM, Hei Chan <structurechart@xxxxxxxxx> wrote:
> I ran into a similar issue but I got no response here.
>
> My guess is that -flto makes your average run time smaller. But I guess most
> people use it in the hope of making their critical paths run faster,
> which are what they benchmark.
>
>
> ________________________________
> From: Uri Moszkowicz <uri@xxxxxxxxx>
> To: gcc-help <gcc-help@xxxxxxxxxxx>
> Sent: Wednesday, June 12, 2013 5:48 AM
> Subject: -flto making program slower?
>
> Hi,
> I'm having trouble with link time optimization in my application. It
> is a large application that uses only basic C++ (no exceptions, no
> templates, no STL, no floating point, etc). Like many applications,
> the source files are compiled separately into object files. Some of
> those are combined into shared libraries. The shared libraries are
> then statically linked with the remaining object files to produce the
> final application, which is about 100MB big. We are using GCC 4.7.2
> with a non-GOLD 2.23.1 binutils.
>
> I simply added "-flto" to the GCC command to create object files:
> g++ -Wall -pipe -O3 -flto -fno-strict-aliasing -mtune=generic
> --no-exceptions -fPIC -c some.cc
>
> I then added "-flto" to the final link command but not the shared libraries:
> g++ -o exec -Xlinker some1.o some2.o -static some1.a some2.a
> -Wl,--wrap,open -flto
>
> I ran a benchmark of tests and the resulting execution time is now
> about 7% higher than it was without "-flto" added. Any suggestions for
> how to improve this result or why it may have gotten slower?
>
> If it helps, when I add "-fuse-linker-plugin" I get this error:
> g++: error: -fuse-linker-plugin is not supported in this configuration
>
> Thanks!
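
The benchmark described in this thread (run the program once per input
and average the times) could be sketched roughly as below, so that a
build with -flto and one without are measured the same way. This is
only a minimal sketch, not the script anyone in the thread actually
used: the avg_runtime helper and the binary and input names are made up
for illustration, and it assumes GNU date (for %N nanoseconds) and awk
are available.

```shell
# Hypothetical helper: run BINARY once per input and print the mean
# wall-clock time per run, in seconds.
# Assumes GNU date (the %N nanosecond format) and awk.
# The body is a subshell "( ... )" so its variables stay local.
avg_runtime() (
    binary=$1
    shift
    total_ns=0
    runs=0
    for input in "$@"; do
        start=$(date +%s%N)
        "$binary" "$input" >/dev/null 2>&1
        end=$(date +%s%N)
        total_ns=$(( total_ns + end - start ))
        runs=$(( runs + 1 ))
    done
    # Nothing to average if no inputs were given.
    [ "$runs" -gt 0 ] || exit 1
    awk -v t="$total_ns" -v n="$runs" 'BEGIN { printf "%.6f\n", t / n / 1e9 }'
)

# Example usage (placeholder binary and input names):
#   avg_runtime ./exec_nolto input1 input2 input3
#   avg_runtime ./exec_lto   input1 input2 input3
```

An average over whole runs like this can hide a slowdown in one hot
path, which is the concern raised at the top of this message; timing
the critical path separately would need instrumentation inside the
program itself.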