Re: lto optimization

Oleg Endo <oleg.endo@xxxxxxxxxxx> · Wed, 04 May 2016 20:31:28 +0900

On Wed, 2016-05-04 at 12:57 +0200, Aurelien Buhrig wrote:

> I'm trying to generate LTO optimized code (with -O2 -flto).
> The code is very simple:
> 
> int *SP;
> int popInt(void) {return *--SP;}
> void pushInt(int v) {*SP++ = v;}
> void add (void) { pushInt(popInt()+popInt()); }
> 
> When the 3 global functions popInt, pushInt and add are in the same
> compilation unit, the LTO optimization works as expected and generates
> something like
> *(SP-2) += *(SP-1);
> SP--;
> 
> But when add is in a different compilation unit, gcc does not manage
> to do this optimization at link time. Worse, it does not inline the
> popInt function (but it does inline pushInt, so LTO is being performed).
> 
> I tried using the gcse options which are not enabled with O2, O3,
> adding -finline-functions and tuning inline limits without success.
> Any hint how I could get the same behavior with separate compilation
> units?
> 
> This is a gcc 4.6 on a private target.

I'd recommend trying a newer version.  GCC 4.6 is about 5 years old and
many improvements have been made to LTO since then.

Other than that, make sure that each compilation unit is compiled with
-flto and that the link is done through the gcc driver program (also
passing -flto), not by invoking ld directly.

Cheers,
Oleg