On Tue, Sep 1, 2020 at 10:41 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Aug 31, 2020 at 06:49:56PM -0700, Andrii Nakryiko wrote:
> > +
> > +__noinline int sub1(int x)
> > +{
> > +	return x + 1;
> > +}
> > +
> > +static __noinline int sub5(int v);
> > +
> > +__noinline int sub2(int y)
> > +{
> > +	return sub5(y + 2);
> > +}
> > +
> > +static __noinline int sub3(int z)
> > +{
> > +	return z + 3 + sub1(4);
> > +}
> > +
> > +static __noinline int sub4(int w)
> > +{
> > +	return w + sub3(5) + sub1(6);
>
> Did you check that asm has these calls?

Yeah, I actually did check. All calls are there.

> Since sub3 is static the compiler doesn't have to do the call.
> 'static noinline' doesn't mean that compiler have to do the call.
> It can compute the value and replace a call with a constant.
> It only has to keep the body of the function if the address of it
> was taken.

All these subX() functions are either global or call a global function
(sub1() is global), which seems to keep Clang from optimizing all of
this away. Clang has to assume the worst case for global functions,
probably due to LD_PRELOAD, right?
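
To make the distinction concrete, here is a minimal userspace sketch (not
the selftest itself). The names global_add1(), local_add3(),
local_via_global() and caller() are made up for illustration, and
__noinline is defined locally as an assumption about how the macro
expands. Whether a given compiler actually folds the purely static call
depends on its version and flags, so the generated asm is the only
reliable answer.

```c
/* Hypothetical stand-in for the __noinline macro used in the patch. */
#define __noinline __attribute__((noinline))

/* Global (external linkage): the compiler has to emit a body for it. */
__noinline int global_add1(int x)
{
	return x + 1;
}

/*
 * Static and self-contained: when called with a constant argument the
 * compiler is allowed to compute the result and drop the call, even
 * though the function is marked noinline.
 */
static __noinline int local_add3(int z)
{
	return z + 3;
}

/* Static, but calls a global function, like sub3() in the quoted patch. */
static __noinline int local_via_global(int z)
{
	return z + 3 + global_add1(4);
}

/* Global entry point that exercises all three cases. */
int caller(int w)
{
	return w + local_add3(5) + local_via_global(5) + global_add1(6);
}
```

Building this with something like "clang -O2 -S" and searching the output
for call instructions to each function shows which calls survived for
that particular compiler and flag combination.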