Re: g++ optimization issue / useless instructions for stack access


 



On 05/25/2014 01:32 PM, Niklas Gürtler wrote:
> Hello GCC List,
> 
> I am currently working on a hardware API in C++11 for ARM Cortex-M3
> microcontrollers. It provides an object oriented way of accessing
> hardware registers. The idea is that the user need not worry about
> individual registers and their composition of bit fields but can access
> these with symbolic names.
> The API uses temporary objects and call chaining for syntactic sugar.
> The problem is that GCC produces code that is correct, but far too
> large and too slow.
> 
> See the attached simplified testcase (with a dummy linker script to
> shorten disassembler output) and the function getInput. When compiling
> with gcc-arm-embedded ( https://launchpad.net/gcc-arm-embedded ), this
> is the code generated by GCC:

With GCC 4.8.1 I get something similar.

I tried trunk GCC (for AArch64) and I get:

0000000000000000 <getInput()>:
   0:   d29ffdc0        mov     x0, #0xffee                     // #65518
   4:   f2a01800        movk    x0, #0xc0, lsl #16
   8:   b9400000        ldr     w0, [x0]
   c:   d3524800        ubfx    x0, x0, #18, #1
  10:   d65f03c0        ret

I think you'd get something very similar for 32-bit ARM.
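
For anyone not used to reading AArch64, the listing above boils down to: build the address 0xc0ffee, load 32 bits from it, and extract bit 18. A rough C++ rendering of that arithmetic follows. The helper names build_address and ubfx are made up here for illustration, and the sketch assumes a field width below 64:

```cpp
#include <cstdint>

// mov x0, #0xffee ; movk x0, #0xc0, lsl #16
// The mov sets the low 16 bits (and clears the rest); the movk then
// inserts the next 16 bits, leaving the low half untouched.
constexpr std::uint64_t build_address(std::uint64_t low16, std::uint64_t high16) {
    return low16 | (high16 << 16);
}

// ubfx x0, x0, #lsb, #width
// Unsigned bit-field extract: shift the field down and mask off the rest.
// Assumes width < 64.
constexpr std::uint64_t ubfx(std::uint64_t value, unsigned lsb, unsigned width) {
    return (value >> lsb) & ((std::uint64_t{1} << width) - 1);
}

static_assert(build_address(0xffee, 0xc0) == 0xc0ffee, "address seen in the listing");
static_assert(ubfx(std::uint64_t{1} << 18, 18, 1) == 1, "bit 18 set");
static_assert(ubfx(0, 18, 1) == 0, "bit 18 clear");
```

So the whole function is one load plus one shift-and-mask, which is about as small as it can get.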

But really, I think you are going down the wrong path.  If you want GCC
to generate tight code, you should write tight code.  Don't write lots
of pointless stuff in the hope that GCC will notice it's pointless.
Maybe it will, maybe not.  Your API is rather complicated for what it
does.  You should be able to write it in a way that is less work.
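
To make that concrete, here is a minimal sketch of the kind of thin wrapper I have in mind. It is not your API: Field, InputBit and this getInput signature are hypothetical names, and the only facts taken from the testcase are the bit position (18) and the address (0xc0ffee) visible in the disassembly. The register pointer is passed in as a parameter so the sketch can also be exercised on a host machine.

```cpp
#include <cstdint>

// Hypothetical field descriptor: a bit position and width inside a
// 32-bit register, fixed at compile time so no object is ever created.
template <unsigned Lsb, unsigned Width>
struct Field {
    static_assert(Width >= 1 && Lsb + Width <= 32, "field must fit in 32 bits");

    // Mask for the field after it has been shifted down to bit 0.
    static constexpr std::uint32_t mask() {
        return 0xFFFFFFFFu >> (32u - Width);
    }

    // One volatile load, one shift, one mask.
    static std::uint32_t read(const volatile std::uint32_t* reg) {
        return (*reg >> Lsb) & mask();
    }
};

// Symbolic name for the single bit the testcase reads (bit 18).
using InputBit = Field<18, 1>;

inline std::uint32_t getInput(const volatile std::uint32_t* reg) {
    return InputBit::read(reg);
}
```

On the target you would pass something like reinterpret_cast<const volatile std::uint32_t*>(0x00C0FFEE). There are no temporaries and no call chain for the optimizer to see through, so the generated code should not depend much on how clever the compiler happens to be.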

Andrew.




