Re: Optimisation puzzle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lawrence Crowl skrev:
On 3/16/07, Erik <sigra@xxxxxxx> wrote:
Ian Lance Taylor skrev:
> Erik <sigra@xxxxxxx> writes:
>
>
>>  From man:puts I see that it is declared "int puts(const char
>> *)". This means that puts does not promise to leave its argument
>> unchanged. Therefore the caller must push the argument anew before
>> each call. If it had been declared "int puts(const char * const)"
>> instead, the push should be moved outside the loop. Unfortunately this
>> does not seem to work. I tried with the following program:
>> void q(const unsigned int);
>> void f() {for (unsigned int x = 0; x != 10; x++) q(77);}
>>
>> and built it with "gcc -std=c99 -Os -Wall -Wextra -Werror -S":
>> .L2:
>>         subl    $12, %esp
>>         incl    %ebx
>>         pushl   $77
>>         call    q
>>         addl    $16, %esp
>>         cmpl    $10, %ebx
>>         jne     .L2
>>
>> As you can see, "pushl   $77" is still inside the loop even though q
>> promises to not change its argument. This must be a bug.
>
> This is not a bug.  const on an automatic variable in C is more
> advisory than anything else.  You are not permitted to change a const
> object, but you can cast its address to a non-const pointer.

The "lost optimization" in this case has nothing to do with const
versus non-const.  The issue is that the call deallocates the
parameters. The pushl is allocating the argument.
OK, as I understand it, the pushl $77 does 2 things:
1. Copy the value 77.
2. Allocate space for it (by changing the stack pointer).

And in C, allocating the space must be done in each iteration because the called function deallocates it. But copying the value 77 in each iteration could be optimized away.

But what about Ada? Does it have the same calling convetion? See the Ada loop:
.L5:
       pushl   $77
.LCFI3:
       call    _ada_q
       popl    %eax
       decl    %ebx
       jns     .L5


It looks like in Ada, the caller deallocates the parameter (popl). If this is so, it would mean that both the push and the pop should be moved out of the loop, something like:
       pushl   $77
.L5:
.LCFI3:
       call    _ada_q
       decl    %ebx
       jns     .L5
       popl    %eax

This would cut the loop down to only 3 instructions. Not bad compared to the 7 instructions in the original C loop. But the C version should probably be counted as having 8 instructions to make the comparison accurate, since it has a hidden popl inside q, which is not in Q of the Ada version. Is this correct?

[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux