On 05.04.2013 01:08, Justin Chudgar wrote:
> I had experimentally thrown an optimization into my module's only significantly
> warm functions. Since I am a novice, this was a just-for-kicks experiment, but
> I would like to know whether to optimize at all beyond the general "-O2", and
> which platforms are critical to consider, since I only use pulse on systems that
> are sufficient to run at "-O0" without noticeable problems beyond unnecessary
> power consumption.
>
> From another thread:
>
>> I'm not sure what to think about the __attribute__((optimize(3))) usage.
>> Have you done some benchmarking that shows that the speedup is
>> significant compared to the normal -O2? If yes, I guess we can keep
>> them. <tanuk>
>
> I don't know what to think of them either. I did a really simplistic benchmark
> with the algorithm on my Core i3 laptop, initially to determine whether it was
> useful to keep everything double or float. There was no benefit to reducing
> precision on this one system, but that attribute was dramatic. I did not try -O2,
> though, just -O3 and -O0. I thought about messing with vectorization, but I only
> have x86-64 PCs, and that seems most valuable for embedded devices, which I
> cannot test at the moment.

*Any* optimization level is a *massive* speedup compared to -O0. -O0 means that all optimizations are disabled, including even basic things like keeping variables in CPU registers. The assembly generated at -O0 is probably 2-3x as large as at the other levels. Comparing -O0 against anything is not useful; use it only for debugging. For meaningful comparisons, compare -O{,s,1,2,3}.

Best regards.
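
For reference, a minimal self-contained sketch of the kind of per-function override being discussed, using GCC's optimize function attribute. The function and data here are hypothetical placeholders, not taken from the actual module:

    /* Compile with e.g. "gcc -O2 -o mix mix.c". Everything is built at -O2,
     * but the marked function is compiled as if -O3 were in effect, so the
     * attribute can be benchmarked against the file-wide level. */
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical hot function; GCC-specific attribute, equivalent to
     * __attribute__((optimize(3))). */
    __attribute__((optimize("O3")))
    static float mix_gain(const float *samples, size_t n, float gain)
    {
        /* Hot inner loop; a candidate for vectorization at -O3. */
        float acc = 0.0f;
        for (size_t i = 0; i < n; i++)
            acc += samples[i] * gain;
        return acc;
    }

    int main(void)
    {
        float buf[1024];
        for (size_t i = 0; i < 1024; i++)
            buf[i] = (float)i / 1024.0f;
        printf("%f\n", (double)mix_gain(buf, 1024, 0.5f));
        return 0;
    }

The point of the sketch is only that the attribute changes the level for one function while the rest of the translation unit keeps the level given on the command line; as noted above, the meaningful baseline for any such benchmark is -O2, not -O0.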