On Fri, Jan 11, 2013 at 12:46 PM, Václav Zeman <vhaisman@xxxxxxxxx> wrote:
>
> I am wondering if there are not some improvements that could be made in
> generation of x86 FPU code. Here is a simple sign function:
>
> float signf4(float x)
> {
>   return x < 0.0f
>     ? -1.0f
>     : 1.0f;
> }
>
> It generates the following assembler code (GCC 4.7.2, g++ -m32 -O3
> -fverbose-asm -save-temps -g3 -ggdb -march=native):
>
> _Z6signf4f:
> .LFB84:
>         .loc 1 27 0
>         .cfi_startproc
> .LVL4:
>         .loc 1 30 0
>         fld1
>         fldz
>         flds    4(%esp)         # x
>         fxch    %st(1)          # ??? Why?
>         fucomip %st(1), %st     #,
>         ffreep  %st(0)          #
>         fld1
>         fchs
>         fcmovbe %st(1), %st     #,,
>         fstp    %st(1)          #
>         .loc 1 31 0
>         ret
>
> I am wondering why is the fxch instruction necessary and why is the code
> not instead like this?
>
> _Z6signf4f:
> .LFB84:
>         .loc 1 27 0
>         .cfi_startproc
> .LVL4:
>         .loc 1 30 0
>         fld1
>         flds    4(%esp)         # ??? Load the parameter before the zero.
>         fldz                    # ??? to avoid the fxch instruction.
>         fucomip %st(1), %st     #,
> [...]

I don't know.  Please open a bug report as described at
http://gcc.gnu.org/bugs .  Thanks.

Note that the 387 floating point operations are not high priority these
days, as it is generally better to use SSE.

Ian