Paul Dovydaitis <paul.dovy@xxxxxxxxx> writes:

> I have been playing around with the profile guided optimization flags
> in GCC and had a few questions.  I actually noticed that with the
> particular application I was profiling, I in some cases got worse
> results after recompiling with -fprofile-use and re-running the same
> test I used to generate the profile.
>
> I could not find any detailed documentation about how the profile is
> applied during the optimization steps, but my working hypothesis is as
> follows.  The application in question is very latency sensitive, so it
> spins in a tight loop while polling for data from various sources.
> When data is received, it then processes it - and this processing is
> the portion in which speed is most important and which I am timing.
> If one were to look at hit counts for individual functions or source
> lines, though, it would appear that the polling loop was "hot" while
> everything else by comparison was extremely "cold".  Global
> optimizations that use this sort of data would then likely hurt,
> rather than help, those processing times.

While it's hard to know for sure, your explanation is plausible.

> What optimizations does -fprofile-use turn on that would be sensitive
> to this sort of behavior?  Does this seem like a reasonable
> explanation, and if so is there anything I can tweak to help?  The
> specific flags that -fprofile-use turns on (-fbranch-probabilities,
> -fvpt, -funroll-loops, -fpeel-loops and -ftracer) don't seem like they
> would be affected since these optimizations appear local in scope, but
> I assume there are other things happening under the hood.

Profiling affects inlining decisions.

Ian
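
[Editorial note: a minimal C sketch of the shape described above - a tight
polling loop whose iterations dominate the profile counters, wrapped around
a latency-critical processing routine - together with the usual
-fprofile-generate / -fprofile-use build steps.  All file and function names
are made up for the sketch; it only illustrates why the counters end up
skewed, it is not a fix from the thread.]

/* sketch.c - illustration of the structure described above.
 * All names here are hypothetical.
 *
 * Typical PGO workflow:
 *   gcc -O2 -fprofile-generate sketch.c -o sketch   (instrumented build)
 *   ./sketch                                        (training run)
 *   gcc -O2 -fprofile-use sketch.c -o sketch        (optimized build)
 */

#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the real data sources; in this sketch data arrives only
 * rarely, so nearly all execution counts land in the polling loop. */
static bool poll_for_data(char *buf, size_t *len)
{
    static long calls;
    if (++calls % 10000 == 0) {   /* pretend data shows up occasionally */
        buf[0] = 'x';
        *len = 1;
        return true;
    }
    return false;
}

/* The latency-critical path.  Relative to the polling loop it runs rarely,
 * so profile feedback sees it as cold even though its speed is what is
 * actually being measured - which, per the reply above, can change
 * inlining decisions for it. */
static long process_data(const char *buf, size_t len)
{
    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

int main(void)
{
    char buf[4096];
    size_t len = 0;
    long total = 0;

    /* Bounded here so the sketch terminates; the real application spins
     * indefinitely waiting for data. */
    for (long i = 0; i < 100000000L; i++) {
        if (poll_for_data(buf, &len))
            total += process_data(buf, len);
    }
    return (int)(total & 0x7f);
}

[One tweak worth trying - an assumption, not something suggested in the
thread - is to apply -fprofile-use only to the translation units for which
the training run is representative, since the flag is applied per
compilation; whether that helps will depend on how much of the benefit comes
from cross-file inlining.]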