Re: extended asm - memory constraint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



mc2718 <mc2718@xxxxxxxxx> writes:

> The short code below highlights the problem. Instead of 10, the output is a
> random value (I compiled it with
> g++ -o tst -O3 -Wall -march=native -mtune=native tst.C). If you add "memory"
> to the clobber list, you get correct output.
>
>
> #include <iostream>
>
> using namespace std;
>
> struct int128 {
>
>   unsigned long long int lo;
>   unsigned long long int hi;
>
> } __attribute__ ((aligned(16))) ;
>
>
> int128 testFN(unsigned long long a) {
>   unsigned long long alo = (unsigned int) a;
>   unsigned long long ahi = (a >> 32);
>   int128 res;
>   asm __volatile__ (
>     // compute basic products (alo * alo, ahi * ahi, 2 * alo * ahi) 
>     "movq      (%0)  , %%xmm7 \n\t"   // xmm7 = (alo, 0)
>     "movhpd    (%1)  , %%xmm7 \n\t"   // xmm7 = (alo, ahi)
>     "movdqa  %%xmm7  , (%2)   \n\t"   // res.lo = alo, res.hi = ahi
>     : //"=&m" (res)
>     : "rV" (&alo), "rV" (&ahi), "rV" (&res)
>     : "%xmm7" //, "memory"
>   );
>   return  res;
> }

This is wrong.  You are telling gcc that the asm code needs the
address of alo and ahi, but you aren't telling gcc that it needs the
value.  This causes gcc to never bother to initialize alo and ahi at
all.  Adding the memory clobber works by forcing gcc to initialize the
memory.

This works:

int128 testFN(unsigned long long a) {
  unsigned long long alo = (unsigned int) a;
  unsigned long long ahi = (a >> 32);
  int128 res;
  asm __volatile__ (
    // compute basic products (alo * alo, ahi * ahi, 2 * alo * ahi) 
    "movq      %1  , %%xmm7 \n\t"   // xmm7 = (alo, 0)
    "movhpd    %2  , %%xmm7 \n\t"   // xmm7 = (alo, ahi)
    "movdqa  %%xmm7  , %0   \n\t"   // res.lo = alo, res.hi = ahi
    : "=m" (res)
    : "m" (alo), "m" (ahi)
    : "%xmm7"
  );
  return  res;
}

You might also consider using the intrinsic functions rather than
explicit asm statements.

Ian


[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux