Re: In GCC 10.2, -O2 optimization enables more than docs suggest

On 21/01/2021 04:53, mark_at_yahoo wrote:
On 1/20/21 1:17 PM, David Brown wrote:

I /never/ use -O0, precisely because I find it absolutely terrible for
assembly level debugging.  You can't see the wood for the trees, as all
local variables are on the stack, and even the simplest of C expressions
ends up with large numbers of assembly instructions.  In my experience -
and this is obviously very subjective - using -O1 gives far more
readable assembly code while avoiding the kinds of code re-arrangement
and re-ordering of -O2 that makes assembly-level debugging difficult.
(-Og is an alternative for modern gcc versions, which can give most of
the speed of -O2 but is a little easier for debugging).

Interesting. My recollection is that -O0, regardless of variables being on the stack, was more "linear": Each C statement was followed more-or-less by the assembly code required to implement it, then the next C statement and so on.

Yes, that is true. But I find (and again this is my own experience, which may not match others - it may also depend on the way you write your code) that the amount of stack manipulation code makes it impossible to see what is really happening. For each "arithmetic" assembly line doing an add, you've got two or more loads and stores to the stack. And when the "real" assembly is a load or a store, you can't see it amongst all the other stack loads and stores.

It is even worse if you use C++. If you've got a few template classes and overloaded operators, -O0 might lead you down a path of a dozen nested function calls, each with its own stack frame, when with -O1 you have just the one assembly instruction that is actually relevant.
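As a small sketch of the sort of thing I mean (the class and function names are just made up for illustration, and the exact output depends on the target and gcc version):

	struct Counter {
	    int v;
	    explicit Counter(int x) : v(x) {}
	    Counter operator+(int n) const { return Counter(v + n); }
	    int value() const { return v; }
	};

	// At -O0 the constructor, operator+ and value() are all real calls,
	// each with its own stack frame.  At -O1 this typically collapses to
	// a single add instruction, just like the plain C example below.
	int bump(int x)
	{
	    return (Counter(x) + 1).value();
	}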

With -O1, the assembly for an "add one" function:

	int inc(int x) { return x + 1; }

is:

inc(int):
        adds    r0, r0, #1
        bx      lr

That's simple, easy to understand, easy to follow, easy to step through at the assembly level.

With -O0, you get:

inc(int):
        push    {r7}
        sub     sp, sp, #12
        add     r7, sp, #0
        str     r0, [r7, #4]
        ldr     r3, [r7, #4]
        adds    r3, r3, #1
        mov     r0, r3
        adds    r7, r7, #12
        mov     sp, r7
        ldr     r7, [sp], #4
        bx      lr


I can't answer for anyone else, but I know which version /I/ would rather try to follow.

In particular, variables in -O1 can get tucked away into registers and "disappear" for long stretches of assembly before popping up again, and there is spaghetti-code jumping from common code block elimination. All of which is good optimization, of course, but it makes things hard to follow.

With -O1, you get very little "spaghetti-jumping" - generated code follows linear paths that match the source to a large extent. -O2 is a different matter - that's when a lot more of the code re-ordering and re-arranging comes in. Yes, -O1 puts data in registers - IMHO that is a /good/ thing for debugging because it makes the code simpler and clearer. And yes, it makes some code and variables "disappear". That can be a good thing (clearing away unnecessary detail), or a bad thing.
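To illustrate the kind of merging you mean with a sketch (the function is made up, and whether the tails are actually combined depends on the target and gcc version) - -O2 enables -fcrossjumping, which can merge identical tails of different branches, so single-stepping the merged block no longer tells you which branch you came through:

	int adjust(int x, int *p)
	{
	    if (x > 0) {
	        *p += 1;
	        return x * 3;    /* identical tail ...                      */
	    } else {
	        *p -= 1;
	        return x * 3;    /* ... may be cross-jumped with the one
	                            above at -O2, so the debugger shows a
	                            single shared block for both branches   */
	    }
	}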

Sometimes I'll temporarily add a "volatile" qualifier to a variable, or a "noinline" attribute to a function, in order to make debugging easier. That's part of the process. I actually do almost all of my compilation with -O2 as the starting point (and various fine-tuning flags) - and I do my debugging with that build. I strongly dislike the idea of separate debug/release builds, and do not make that distinction. If I need a lower optimisation level to help trace an awkward problem, I'll add a "#pragma GCC optimize ("O1")" line to the relevant code.
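For example, something along these lines (the function names are made up; the attribute and pragmas are standard gcc features):

	/* Keep this function out of line even in an -O2 build, so it can
	   have a breakpoint on it and be stepped through on its own.     */
	__attribute__((noinline))
	static int scale(int x)
	{
	    /* A temporary "volatile" stops the value being cached in a
	       register or removed entirely, so it stays visible in the
	       debugger.                                                   */
	    volatile int tmp = x * 3;
	    return tmp + 1;
	}

	/* Drop just the following code to -O1 without touching the build
	   flags.                                                          */
	#pragma GCC push_options
	#pragma GCC optimize ("O1")
	int awkward_to_debug(int x)
	{
	    return scale(x) - 2;
	}
	#pragma GCC pop_options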


But I'll have to revisit the issue.


Another major benefit of -O1 is that it enables much more code analysis,
which in turn enables much better static checking - I am a big fan of
warning flags and having the compiler tell me of likely problems before
I get as far as testing and debugging.

Me, too ("big fan").

Well, remember that "-O0" greatly limits these warnings.
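A typical example is -Wmaybe-uninitialized (part of -Wall), which relies on the data-flow analysis done by the optimisers. Something like this (the function is just made up for illustration) usually compiles silently with "-O0 -Wall" but gets a warning at -O1 or above:

	int pick(int x)
	{
	    int y;
	    if (x > 0)
	        y = 2 * x;
	    return y;    /* "y" may be used uninitialised when x <= 0 */
	}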

IMHO, gcc should introduce a new warning that is enabled by default at -O0, giving the message "You are using the world's most powerful compiler, but you are crippling it with your choice of flags." There should also be a warning if "-Wall" is not enabled.

(OK, perhaps that would be going a little too far...)



(Your project here looks very interesting - I'm going to have a good
look at it when I get the chance.  I won't be able to use it directly,
as a pure GPL license basically makes it unusable for anything but
learning or hobby use, but as it matches ideas I have had myself I am
interested in how it works.)

Thanks. Yes, it's basically a simple idea, and I found out recently that others have attempted something similar (which I wish I'd known when I started doing it myself). This is now very off-topic for this list, but I'd like to get your input, including the GPL vs LGPL issue (ironic, given that this is a GNU mailing list). Maybe open an issue at the GitHub repository and we can discuss it there?

I'll look through the project first, and then get back to you - either through GitHub or your email address. In the meantime, you could look at the licensing for FreeRTOS to see if its "GPL with exception" licence suits you. (gcc also has a kind of "GPL with exception" licence, otherwise it could not be used for anything other than GPL'ed code.)


mvh.,

David


