Re: [patch] measurements, numbers about CONFIG_OPTIMIZE_INLINING=y impact

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Fri, 9 Jan 2009 16:41:33 -0800 (PST)

On Sat, 10 Jan 2009, Andi Kleen wrote:
>
> > What's the cost/benefit of that 4%? Does it actually improve performance? 
> > Especially if you then want to keep DWARF unwind information in memory in 
> > order to fix up some of the problems it causes? At that point, you lost 
> 
> dwarf unwind information has nothing to do with this, it doesn't tell
> you anything about inlining or not inlining.  It just gives you
> finished frames after all of that has been done.
> 
> Full line number information would help, but I don't think anyone 
> proposed to keep that in memory.

Yeah, true. Although one of the reasons inlining actually ends up causing 
problems is because of the bigger stack frames. That leaves a lot of space 
for old stale function pointers to peek through.

With denser stack frames, the stack dumps look better, even without an 
unwinder.

> > Does it help I$ utilization (which can speed things up a lot more, and is 
> > probably the main reason -Os actually tends to perform better)? Likely 
> > not. Sure, shrinking code is good for I$, but on the other hand inlining 
> > can actually be bad for I$ density because if you inline a function that 
> > doesn't get called, you now fragmented your footprint a lot more.
> 
> Not sure that is always true; the gcc basic block reordering 
> based on its standard branch prediction heuristics (e.g. < 0 or
> == NULL unlikely or the unlikely macro) might well put it all out of line.

I thought -Os actually disabled the basic-block reordering, doesn't it?

And I thought it did that exactly because it generates bigger code and 
much worse I$ patterns (ie you have a lot of "conditional branch to other 
place and then unconditional branch back" instead of "conditional branch 
over the non-taken code".

Also, I think we've had about as much good luck with guessing 
"likely/unlikely" as we've had with "inline" ;)

Sadly, apart from some of the "never happens" error cases, the kernel 
doesn't tend to have lots of nice patterns. We have almost no loops (well, 
there are loops all over, but most of them we hopefully just loop over 
once or twice in any good situation), and few really predictable things.

Or rather, they can easily be very predictable under one particular load, 
and the totally the other way around under another ..

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html