Re: problems with gcc inline assembly using xmm registers

David Palao wrote:

__asm__ __volatile__ ("movsd %0, %%xmm3 \n\t" \
      "movsd %1, %%xmm6 \n\t" \
      "movsd %2, %%xmm4 \n\t" \
      "movsd %3, %%xmm7 \n\t" \
      "movsd %4, %%xmm5 \n\t" \
      "unpcklpd %%xmm3, %%xmm3 \n\t" \
      "unpcklpd %%xmm6, %%xmm6 \n\t" \
      "unpcklpd %%xmm4, %%xmm4 \n\t" \
      "mulpd %%xmm0, %%xmm3 \n\t" \
....
      "addpd %%xmm6, %%xmm5 \n\t" \
      "addpd %%xmm7, %%xmm3 \n\t" \
      "movsd %7, %%xmm6 \n\t" \
      "movsd %8, %%xmm7 \n\t" \
      "unpcklpd %%xmm6, %%xmm6 \n\t" \
      "unpcklpd %%xmm7, %%xmm7 \n\t" \
      "mulpd %%xmm1, %%xmm6 \n\t" \
      "mulpd %%xmm2, %%xmm7 \n\t" \
      "addpd %%xmm6, %%xmm4 \n\t" \
      "addpd %%xmm7, %%xmm5" \

Don't write it this way. Use the SSE builtins directly, and then the compiler can handle all the register allocation for you. You'll have to be careful to arrange for no more than 8 xmm values to be live at one time, though, since 32-bit x86 has only eight xmm registers and anything beyond that gets spilled to memory. That's not too hard to achieve. I had success using this technique to do some 2D FFTs; it was way simpler than writing assembly directly.
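As a rough illustration of what that might look like, here is a minimal sketch using the SSE2 intrinsics from <emmintrin.h> (GCC's wrappers around its builtins). It mirrors the movsd / unpcklpd / mulpd / addpd pattern in the quoted fragment: broadcast a scalar into both lanes, multiply by a packed pair, and accumulate. The function names scale_accumulate and combine2 and the data layout are made up for the example; they are not from the original code, whose full computation is elided above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_set1_pd, ... */

/* Hypothetical helper: returns acc + s * v for a pair of doubles.
   _mm_set1_pd does the job of movsd followed by unpcklpd. */
static __m128d scale_accumulate(__m128d acc, double s, __m128d v)
{
    __m128d sv = _mm_set1_pd(s);               /* broadcast s into both lanes */
    return _mm_add_pd(acc, _mm_mul_pd(sv, v)); /* mulpd, then addpd */
}

/* Hypothetical example: out[i] = a * x[i] + b * y[i] for i = 0, 1.
   No register names appear anywhere; the compiler picks them. */
void combine2(double out[2], double a, const double x[2],
              double b, const double y[2])
{
    __m128d vx  = _mm_loadu_pd(x);   /* unaligned load of two doubles */
    __m128d vy  = _mm_loadu_pd(y);
    __m128d acc = _mm_setzero_pd();

    acc = scale_accumulate(acc, a, vx);
    acc = scale_accumulate(acc, b, vy);

    _mm_storeu_pd(out, acc);         /* unaligned store of the result */
}
```

On 32-bit x86 this needs -msse2 (it is on by default for x86-64). Only a handful of __m128d values are live at once here, so the compiler should have no trouble keeping them all in registers.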

nathan

--
Nathan Sidwell    ::   http://www.codesourcery.com   ::     CodeSourcery LLC
nathan@xxxxxxxxxxxxxxxx    ::     http://www.planetfall.pwp.blueyonder.co.uk

