Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Sun, 11 Jan 2009, Linus Torvalds wrote:
> On Sun, 11 Jan 2009, Andi Kleen wrote:
> > 
> > Was -- i think that got fixed in gcc. But again only in newer versions.
> 
> I doubt it. People have said that about a million times, it has never 
> gotten fixed, and I've never seen any actual proof.

In fact, I just double-checked.

Try this:

	struct a {
		unsigned long array[200];
		int a;
	};

	struct b {
		int b;
		unsigned long array[200];
	};

	extern int fn3(int, void *);
	extern int fn4(int, void *);

	static inline __attribute__((always_inline)) int fn1(int flag)
	{
		struct a a;
		return fn3(flag, &a);
	}

	static inline __attribute__((always_inline)) int fn2(int flag)
	{
		struct b b;
		return fn4(flag, &b);
	}

	int fn(int flag)
	{
		if (flag & 1)
			return fn1(flag);
		return fn2(flag);
	}

(yeah, I made sure it would inline with "always_inline" just so that the 
issue wouldn't be hidden by any "avoid stack frames" flags).

Gcc creates a big stack frame that contains _both_ 'a' and 'b', and does 
not merge the allocations together even though they clearly have no 
overlap in usage. Both 'a' and 'b' get 201 long-words (1608 bytes) of 
stack, causing the inlined version to have 3kB+ of stack, even though the 
non-inlined one would never use more than half of it.

So please stop claiming this is fixed. It's not fixed, never has been, and 
quite frankly, probably never will be because the lifetime analysis is 
hard enough (ie once you inline and there is any complex usage, CSE etc 
will quite possibly mix up the lifetimes - the above is clearly not any 
_realistic_ example).

So even if the above trivial case could be fixed, I suspect a more complex 
real-life case would still keep the allocations separate. Because merging 
the allocations and re-using the same stack for both really is pretty 
non-trivial, and the best solution is to simply not inline.

(And yeah, the above is such an extreme case that gcc seems to realize 
that it makes no sense to inline because the stack frame is _so_ big. I 
don't know what the default stack frame limit is, but it's apparently 
smaller than 1.5kB ;)

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux