Re: problems with gcc inline assembly using xmm registers


 



Thank you for the answer!

>
> I had some trouble using mmx and xmm registers, too. But my code works
> now. See the attached code snippet. You can get an idea of how to use the
> input list, output list and clobber list and also how to share xmm
> registers between different asm inline blocks.
>

I will read it, but it looks hardcore to me (I'm new to assembly).


> You use only memory operands, so this shouldn't be a problem. But you
> haven't declared any output. You're clobbering xmm registers, but you
> don't tell the compiler that you do so. Maybe that's the problem.
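
For illustration, here is a minimal sketch of a block that declares an output operand, two input operands and a clobber list. The helper name scale_pair and its arguments are made up for this example and are not part of the code discussed in this thread. It assumes GCC (or a compatible compiler) on x86 with SSE2, and falls back to plain C++ elsewhere.

```cpp
// Hypothetical helper, not from the original post: multiplies both
// elements of in[0..1] by the scalar s and stores the result in out[0..1].
static inline void scale_pair(double out[2], const double in[2], double s)
{
#if defined(__x86_64__) || (defined(__i386__) && defined(__SSE2__))
    __asm__ __volatile__(
        "movsd    %2, %%xmm1     \n\t"  // xmm1 = [s, 0]
        "unpcklpd %%xmm1, %%xmm1 \n\t"  // xmm1 = [s, s]
        "movupd   %1, %%xmm0     \n\t"  // xmm0 = [in[0], in[1]] (unaligned load)
        "mulpd    %%xmm1, %%xmm0 \n\t"  // xmm0 = [in[0]*s, in[1]*s]
        "movupd   %%xmm0, %0"           // store both lanes to out
        : "=m" (*(double (*)[2]) out)       // output: the two doubles at out
        : "m" (*(const double (*)[2]) in),  // input %1: the two doubles at in
          "m" (s)                           // input %2: the scalar
        : "xmm0", "xmm1");                  // registers the block overwrites
#else
    // Portable fallback for targets without SSE2 inline asm.
    out[0] = in[0] * s;
    out[1] = in[1] * s;
#endif
}
```

Every register the block touches is loaded inside the block itself, so nothing depends on what an earlier asm statement left in xmm0 or xmm1. The output operand tells the compiler which memory is written, so it does not have to guess.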

What's the problem if I don't need output operands?

As for the clobbering part: I have also tried listing the xmm registers as
clobbered (I hope I did it right). For instance:

__asm__ __volatile__ ("movsd %0, %%xmm3 \n\t" \
      "movsd %1, %%xmm6 \n\t" \
      "movsd %2, %%xmm4 \n\t" \
      "movsd %3, %%xmm7 \n\t" \
      "movsd %4, %%xmm5 \n\t" \
      "unpcklpd %%xmm3, %%xmm3 \n\t" \
      "unpcklpd %%xmm6, %%xmm6 \n\t" \
      "unpcklpd %%xmm4, %%xmm4 \n\t" \
      "mulpd %%xmm0, %%xmm3 \n\t" \
      "unpcklpd %%xmm7, %%xmm7 \n\t" \
      "mulpd %%xmm1, %%xmm6 \n\t" \
      "unpcklpd %%xmm5, %%xmm5 \n\t" \
      "mulpd %%xmm0, %%xmm4 \n\t" \
      "addpd %%xmm6, %%xmm3 \n\t" \
      "mulpd %%xmm2, %%xmm7 \n\t" \
      "mulpd %%xmm0, %%xmm5 \n\t" \
      "addpd %%xmm7, %%xmm4 \n\t" \
      "movsd %5, %%xmm6 \n\t" \
      "movsd %6, %%xmm7 \n\t" \
      "unpcklpd %%xmm6, %%xmm6 \n\t" \
      "unpcklpd %%xmm7, %%xmm7 \n\t" \
      "mulpd %%xmm1, %%xmm6 \n\t" \
      "mulpd %%xmm2, %%xmm7 \n\t" \
      "addpd %%xmm6, %%xmm5 \n\t" \
      "addpd %%xmm7, %%xmm3 \n\t" \
      "movsd %7, %%xmm6 \n\t" \
      "movsd %8, %%xmm7 \n\t" \
      "unpcklpd %%xmm6, %%xmm6 \n\t" \
      "unpcklpd %%xmm7, %%xmm7 \n\t" \
      "mulpd %%xmm1, %%xmm6 \n\t" \
      "mulpd %%xmm2, %%xmm7 \n\t" \
      "addpd %%xmm6, %%xmm4 \n\t" \
      "addpd %%xmm7, %%xmm5" \
      : \
      : \
      "m" ((u).c11.real()), \
      "m" ((u).c12.real()), \
      "m" ((u).c21.real()), \
      "m" ((u).c23.real()), \
      "m" ((u).c31.real()), \
      "m" ((u).c32.real()), \
      "m" ((u).c13.real()), \
      "m" ((u).c22.real()), \
      "m" ((u).c33.real())  \
      : \
      "%xmm0", \
      "%xmm1", \
      "%xmm2", \
      "%xmm3", \
      "%xmm4", \
      "%xmm5", \
      "%xmm6", \
      "%xmm7" );

But it doesn't work either way, with or without the clobber list.
Any ideas?

Regards

David

