moving data between x87 and xmm registers

Hi,
I am trying to vectorize a piece of code using SSE2 intrinsics (the
ones in emmintrin.h). I am using double-precision floating-point
arithmetic. The running times I obtained with and without the
vectorization were very similar. I suspect the reason is that, in
the vectorized code, I am storing the contents of a packed xmm
register (represented by an __m128d variable) into a double array.
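
To make the pattern concrete, here is a minimal sketch of the kind of loop I mean. It is not my actual code: the function name mul_arrays and the element-wise multiply are placeholders chosen for illustration, and I use the unaligned load/store intrinsics so that no alignment assumption is needed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical kernel (illustration only): out[i] = a[i] * b[i]. */
void mul_arrays(const double *a, const double *b, double *out, int n)
{
    int i;

    /* Process two doubles per iteration in one xmm register. */
    for (i = 0; i + 1 < n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);

        /* This is the store of a packed __m128d into a double array. */
        _mm_storeu_pd(out + i, _mm_mul_pd(va, vb));
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

The store itself goes straight from the xmm register to memory; the cost I am worried about comes when a later scalar computation reads those doubles back with x87 loads.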

Looking at the generated assembly, I saw that to do this the
contents of the xmm register were first saved to a memory location and
then loaded onto the x87 FPU stack. Apparently there is no direct way
to transfer data between x87 and xmm registers. One way to eliminate
this would be to use xmm registers for all floating-point
calculations. But in spite of using -march=prescott and -mfpmath=sse,
x87 instructions like fld and fstp are still used. Is there any way to
force GCC to use only the xmm registers for all floating-point
calculations? (I tried the -mno-80387 option, but I get
lots of weird linker errors with it.) Or is there any way to move
data between x87 and xmm registers without using memory as an
intermediary?
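
One case that may be relevant, stated as an assumption because I have not confirmed it is what happens in my build: the 32-bit x86 calling convention returns a double in the x87 register st(0). A function like the sketch below (hsum_pd is a hypothetical name) would then need a store to memory followed by an fld just to return its result, even if all the arithmetic is done in xmm registers.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper (illustration only): sum the two lanes of v. */
double hsum_pd(__m128d v)
{
    /* Copy the high lane into the low lane of a second register. */
    __m128d hi = _mm_unpackhi_pd(v, v);

    /* Add the low lanes and extract the result as a scalar double.
     * On a 32-bit target the return value must end up in st(0),
     * so the compiler has to go through memory at this point. */
    return _mm_cvtsd_f64(_mm_add_sd(v, hi));
}
```

If that is the cause, keeping such helpers inlined, or building for x86-64 where double is returned in xmm0, would presumably avoid the round trip.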

Regards
Gautam
