Re: problems with optimisation

On Friday, 28 December 2012 17:33:59, David Brown wrote:
> On 28/12/12 16:19, Andrew Haley wrote:
> > With -O2 there's much less difference:
> > 
> > bar():								bar():
> > 
> > .LFB14:								.LFB14:
> > 	.cfi_startproc							.cfi_startproc
> > 	movl	$3, %edx						movl	$3, %edx
> > 	in %dx, %al							in %dx, %al
> > 	
> > 	movb	$6, %dl					      |		movb	$4, %dl
> > 	movl	%eax, %ecx						movl	%eax, %ecx
> > 	in %dx, %al							in %dx, %al
> > 	
> > 							      >		movb	$6, %dl
> > 							      >		movl	%eax, %edi
> > 							      >		in %dx, %al
> > 	
> > 	movb	$7, %dl							movb	$7, %dl
> > 	movl	%eax, %esi						movl	%eax, %esi
> > 	
> > 							      >		andl	$1, %edi
> > 	
> > 	in %dx, %al							in %dx, %al
> > 	
> > 	movl	%eax, %edi				      |		movl	%eax, %r8d
> > 	
> > 							      >		movsbl	%sil, %esi
> > 	
> > 	movb	$8, %dl							movb	$8, %dl
> > 	subb	%dil, %cl				      |		subb	%r8b, %cl
> > 	in %dx, %al							in %dx, %al
> > 	
> > 	andl	$16, %esi				      |		addl	%edi, %ecx
> > 	
> > 							      >		testb	$16, %sil
> > 	
> > 	setne	%dl							setne	%dl
> > 	
> > 							      >		andl	$1, %esi
> > 	
> > 	addl	%edx, %ecx						addl	%edx, %ecx
> > 	
> > 							      >		subb	%sil, %cl
> > 	
> > 	testb	$16, %al						testb	$16, %al
> > 	setne	%al							setne	%al
> > 	subb	%al, %cl						subb	%al, %cl
> > 	movl	%ecx, %eax						movl	%ecx, %eax
> > 	ret								ret
> > 
> > Without inlining GCC can't tell what your program is doing, and by using
> > -Os you're preventing GCC from inlining.
> > 
> > Andrew.
> 
> There are normally good reasons for picking -Os rather than -O2 for
> small microcontrollers (the OP is targeting AVRs, which typically have
> quite small program flash memories).
> 
> So the solution here is to manually declare the various functions as
> "inline" (or at least "static", so that the compiler will inline them
> automatically).  Very often, code that manipulates bits is horrible on a
> target like the AVR if the function is not inline, and the compiler has
> the bit number(s) as variables - but with inline code generation and
> constant folding, you end up with only an instruction or two for
> compile-time constant bit numbers.
> 
> (To the OP) - also note that there can be significant differences in the
> types of code generation and optimisations for different backends.  I
> assume you posted x86 assembly because you thought it would be more
> familiar to people on this list, but I think it would be more important
> to show the real assembly from the target you are using as you might see
> different optimisations or missed optimisations.
> 
> Finally, there is a mailing list dedicated to gcc on the avr - it might
> be worth posting there too, especially if you think the issue is
> avr-specific.
> 
> David
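
David's suggestion could look roughly like the sketch below. It is only an illustration: `fake_reg`, `clear_bit` and `release_bus` are hypothetical names, and a plain variable stands in for a memory-mapped I/O register so that the code builds on any host. With the helper marked `static inline` and the bit number a constant at the call site, the mask can be folded at compile time. Whether a particular avr-gcc version then emits a single `cbi` depends on the register address and the compiler version.

```cpp
#include <cstdint>

// Hypothetical stand-in for an I/O register (on the AVR this would be a
// memory-mapped register taken from the device headers).
static volatile std::uint8_t fake_reg = 0xFF;

// 'static inline' gives the helper internal linkage, so the compiler is
// free to inline it everywhere and drop the out-of-line copy.
static inline void clear_bit(volatile std::uint8_t &reg, std::uint8_t bit)
{
    reg = static_cast<std::uint8_t>(reg & ~(1u << bit));
}

// The bit number is a constant at the call site, so once clear_bit() is
// inlined the mask ~(1u << 4) can be computed at compile time.
static inline void release_bus()
{
    clear_bit(fake_reg, 4);
}
```

Calling `release_bus()` with `fake_reg` at 0xFF leaves it at 0xEF, i.e. only bit 4 is cleared.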

David: you are right - I used x86 due to its popularity ;)

In my real case I'm observing weird things (speaking of inline):

1. When I use -Os and inline functions in my code, gcc doesn't inline them
(and, AFAIR, it warns that it won't inline because the code would
grow).
The code looks funny then:

00000044 <_ZNK7OneWire14InterruptBasedILt56ELh4EE10releaseBusEv.isra.0.1569.1517>:
  44:	bc 98       	cbi	0x17, 4	; 23
  46:	08 95       	ret


plus a few calls like:
rcall	.-262    	; 0x44 <_ZNK7OneWire14InterruptBasedILt56ELh4EE10releaseBusEv.isra.0.1569.1517>


Those calls are completely useless, as 'cbi' could be placed instead of them,
and the whole function actually consists of 1 instruction (not counting ret).
This is quite important for me as I lose a certain number of clock ticks here
:)

2. When I use -Os and the always_inline attribute, I get messy code like in my
first message (the program gets 70% bigger and uses 2-3x more stack, which is
half of the available memory).
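
One possible middle ground between cases 1 and 2, offered as a sketch and not as a tested fix for this program: force inlining only for the one-instruction helpers and explicitly keep larger routines out of line, instead of marking everything `always_inline`. `always_inline` and `noinline` are GCC function attributes. The names `fake_reg`, `release_bus`, `pull_bus_low` and `send_byte` are hypothetical, and `fake_reg` is a plain variable standing in for an I/O register so the code builds on a host.

```cpp
#include <cstdint>

// Hypothetical stand-in for an I/O register.
static volatile std::uint8_t fake_reg = 0xFF;

// Tiny helpers: a call plus ret costs more than the body itself,
// so these are forced inline even under -Os.
__attribute__((always_inline)) static inline void release_bus()
{
    fake_reg = static_cast<std::uint8_t>(fake_reg & ~(1u << 4));
}

__attribute__((always_inline)) static inline void pull_bus_low()
{
    fake_reg = static_cast<std::uint8_t>(fake_reg | (1u << 4));
}

// Larger routine: kept as one shared out-of-line copy so that its body
// is not duplicated at every call site.
// Returns the number of 1 bits in 'value' (illustrative logic only).
__attribute__((noinline)) static std::uint8_t send_byte(std::uint8_t value)
{
    std::uint8_t ones = 0;
    for (std::uint8_t i = 0; i < 8; ++i) {
        pull_bus_low();
        if (value & (1u << i)) {
            release_bus();
            ++ones;
        }
        release_bus();
    }
    return ones;
}
```

The idea is that the size cost of forced inlining stays limited to helpers whose body is smaller than a call, while everything else is left to the -Os heuristics or pinned out of line.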


It's hard to post the whole AVR program here as it's big, and it's difficult to
produce a smaller example, because it only gets messy when the program gets
bigger.

Andrew: it's inconvenient to use -O2, as -Os produces a program whose size is 30%
of the -O2 result.

regards

-- 
Michał Walenciak
gmail.com kicer86
http://kicer.sileman.net.pl
gg: 3729519



