Ian Lance Taylor wrote:
> Hiroshi Shimamoto <h-shimamoto@xxxxxxxxxxxxx> writes:
>
>> I noticed that the stack usage of the code gcc-4.x generates looks
>> inefficient on x86 and x86_64. I found this while looking at the
>> assembly code of the Linux kernel.
>> Is this inefficient stack usage a regression?
>
> It does seem to be a regression in this case. It seems to be the
> result of the tree reassociation pass. That pass reassociates the
> trees in order to expose redundancies which can then be eliminated.
> Your code ties all the expressions together via the | operation, and
> those all get sorted together. This increases the live ranges of the
> operands, and nothing ever fixes that up.
>
>> I made a simple test case.
>
> Note that your test case is wrong.
>
>> #define copy_from_asm(x, addr, err)		\
>> 	asm volatile(				\
>> 		"1:\tmovl %2, %1\n"		\
>> 		"2:\n"				\
>> 		".section .fixup,\"ax\"\n"	\
>> 		"\txor %1,%1\n"			\
>> 		"\tmov $1,%0\n"			\
>> 		"\tjmp 2b\n"			\
>> 		".previous\n"			\
>> 		: "=r" (err), "=r" (x)		\
>> 		: "m" (*(int*)(addr)))
>
> This says that it sets "err", but it doesn't always do so. I modified
> the last line to this:
>
> 	: "m" (*(int*)(addr)), "0" (err))
>
> which ensures that the register holding 'err' is initialized.

Thanks for looking into this and for the correction. I'll check what
mistake I made when simplifying this issue.

> Please feel free to report a bug; see http://gcc.gnu.org/bugs.html .

Will do.

> Note that your code relies on the fact that the asm does not change
> err in the normal case. You will get much better code if you take
> advantage of that fact:

Thanks for pointing this out; I'm working on changing the code along
those lines.

Thanks,
Hiroshi
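
For reference, here is the test-case macro with Ian's correction folded
in (a sketch; as in the original test case there is no __ex_table entry,
so the .fixup code only models the kernel's pattern for codegen purposes):

/* The added "0" (err) input constraint pairs with output operand 0, so
 * GCC loads the current value of err into that register before the asm
 * runs; the normal path then leaves it unchanged, and err is well
 * defined whether or not the fixup path is taken. */
#define copy_from_asm(x, addr, err)			\
	asm volatile(					\
		"1:\tmovl %2, %1\n"			\
		"2:\n"					\
		".section .fixup,\"ax\"\n"		\
		"\txor %1,%1\n"				\
		"\tmov $1,%0\n"				\
		"\tjmp 2b\n"				\
		".previous\n"				\
		: "=r" (err), "=r" (x)			\
		: "m" (*(int*)(addr)), "0" (err))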
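
Ian's closing suggestion is quoted above without his example, but one
way to take advantage of err being unchanged on the normal path is to
make it a read-write operand (equivalent to the "=r"/"0" pairing) and
thread a single err variable through every copy, dropping the | chain
that the reassociation pass was sorting together. A minimal sketch of
that shape; copy_block and the four-element loop body are illustrative,
not code from the thread:

#define copy_from_asm(x, addr, err)			\
	asm volatile(					\
		"1:\tmovl %2, %1\n"			\
		"2:\n"					\
		".section .fixup,\"ax\"\n"		\
		"\txor %1,%1\n"				\
		"\tmov $1,%0\n"				\
		"\tjmp 2b\n"				\
		".previous\n"				\
		: "+r" (err), "=r" (x)			\
		: "m" (*(int*)(addr)))

/* err starts at 0 and accumulates across the copies: a faulting access
 * would set it to 1, a successful one leaves it alone, so there is no
 * err0 | err1 | ... expression for the compiler to reassociate and the
 * operands' live ranges stay short. */
static int copy_block(int *dst, const int *src)
{
	int err = 0;

	copy_from_asm(dst[0], src + 0, err);
	copy_from_asm(dst[1], src + 1, err);
	copy_from_asm(dst[2], src + 2, err);
	copy_from_asm(dst[3], src + 3, err);

	return err;
}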