On Tue, Dec 20, 2011 at 2:13 PM, Wander Lairson Costa
<wander.lairson@xxxxxxxxx> wrote:
> Dear all,
>
> I have homemade alpha blend code that worked up to gcc 4.5 but
> fails on gcc 4.6 (tested on gcc 4.6.1 [ubuntu] and gcc 4.6.2
> [archlinux]) when I optimize the code for speed (-O1). If I optimize
> for size (-Os) it works fine. To make a long story short, the problem
> is that when optimizing for speed, gcc generates code that accesses
> local variables using the esp register, which causes trouble in a part
> of my code that is written in assembly:
>
>    __asm__ __volatile__ (
>        /* Initialize the counter and skip */
>        /* if the latter is equal to zero. */
>        "movl %0,%%ecx\n\t"
>        "cmpl $0,%%ecx\n\t"
>        "jz not_blend\n\t"
>
>        /* Load the frame buffer pointers into the registers. */
>
>        "pushl %%ebx\n\t"      <------ HERE IS THE ROOT OF THE PROBLEM
>        "movl %1,%%edi\n\t"    <------ In these three lines gcc accesses
>        "movl %2,%%esi\n\t"    <------ the %1, %2, and %3 variables
>        "movl %3,%%ebx\n\t"    <------ using the esp register
>
> The problem is that inside the assembly code I execute a "pushl %%ebx"
> instruction, which updates the esp register, and right after it I
> access local variables using the "%n" idiom. When optimizing for
> speed, gcc emits code that accesses those variables relative to esp,
> and the offsets are no longer valid after the push. When no
> optimization is applied, or when optimizing for size, the local
> variables are accessed through the ebp register and everything runs
> fine.
>
> Now I am unsure whether I am missing some detail of the IA32 spec that
> prohibits me from pushing things onto the stack, or whether gcc is
> emitting some kind of invalid code. Any ideas?
>
> Thanks in advance.
>
> --
> Best Regards,
> Wander Lairson Costa

Hi,

Can you use '-O1 -fno-omit-frame-pointer'? Does -O1 enable
-fomit-frame-pointer? You can also try -fverbose-asm to see the list of
-f options that are passed to cc1 (only useful in a -S compile).
Maybe you can look for the culprit that is haunting your code that way?

kevin
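
For illustration only, here is a minimal sketch of an alternative to the flag: keep push/pop out of the asm block altogether and let gcc choose the scratch register through operand constraints, so the "%n" operands stay valid whether gcc addresses them through esp or ebp. The function add_words, its operands, and the file name blend.c in the comment are hypothetical and are not taken from the original blend routine. As far as I know, the GCC 4.6 release notes list -fomit-frame-pointer as the new default for 32-bit x86 GNU/Linux when not optimizing for size, which would match the -O1 versus -Os difference described above, but please check that against your own build.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the original post): adds src[i] into
 * dst[i] for n 32-bit words.
 *
 * The asm block never touches esp and names no fixed register, so it
 * does not matter whether gcc addresses the memory operands relative
 * to esp or to ebp.
 *
 * To test the flag suggested above, a compile along these lines should
 * work (blend.c is an assumed file name):
 *   gcc -m32 -O1 -fno-omit-frame-pointer -S -fverbose-asm blend.c
 */
void add_words(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
        uint32_t tmp; /* scratch register picked by gcc, not by us */

        __asm__ __volatile__ (
            "movl %2, %0\n\t"   /* tmp = src[i]     */
            "addl %0, %1\n\t"   /* dst[i] += tmp    */
            : "=&r" (tmp),      /* early-clobber scratch output */
              "+m" (dst[i])     /* read-write memory operand    */
            : "m" (src[i])      /* read-only memory operand     */
            : "cc");            /* addl changes the flags       */
#else
        /* Plain C fallback for other compilers and targets. */
        dst[i] += src[i];
#endif
    }
}
```

Calling add_words(d, s, 3) with d = {1, 2, 3} and s = {10, 20, 30} should leave d holding {11, 22, 33}. If the asm really needs a particular register such as ebx, naming it in the clobber list (where gcc accepts that) is usually safer than saving it by hand with pushl.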