mmintrin slower than inline asm or even plain C

Hi guys,

I am writing to you directly because I can't find the relevant mailing
list for help with the mmintrin functions.  There's a thread at:

 http://gcc.gnu.org/ml/gcc-help/2007-04/msg00201.html

that details my problem.

I want to sum an array of longs using MMX.  I use the intrinsics:
   _mm_set_pi32 and _m_paddd
but the resulting binary contains significantly less efficient code
than inline asm or even plain C ( for(i=0;i<n;i++)total+=a[i]; ).
Here's the relevant function:

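/* I and W are defined in the rest of the code (see the thread);
   n is assumed to be a multiple of W. */
int simd_mmintrin(int n, I *is)
{
  __m64 q, r;
  union { int a[2]; __m64 m; } u;   /* two 32-bit lanes, same size as __m64 */
  I i;
  q = _m_from_int(0);               /* zero both accumulator lanes */
  for (i = 0; i < n; i += W) {
      r = _mm_set_pi32(is[i], is[i+1]);   /* pack two elements into one __m64 */
      q = _m_paddd(q, r);                 /* lane-wise 32-bit add */
  }
  u.m = q;
  _m_empty();                       /* clear MMX state once the MMX work is done */
  return u.a[0] + u.a[1];           /* add the two lanes together */
}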

The rest of the code, and a shell script to run it, are in the thread above.
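
For comparison, here is a minimal sketch of the plain C loop next to one variation that might be worth trying: it loads each pair of elements straight from memory as a single 64-bit value instead of assembling it with _mm_set_pi32. The names sum_plain and sum_mmx are hypothetical and do not come from the thread. The sketch assumes 32-bit int elements, and whether it actually generates better code than the function above has not been measured here.

```c
#include <stddef.h>
#include <string.h>
#ifdef __MMX__
#include <mmintrin.h>
#endif

/* Baseline: the plain C loop the intrinsic version is compared against. */
int sum_plain(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hypothetical variant: one 64-bit load per pair of ints, then a
   lane-wise add.  Falls back to the plain loop when MMX is unavailable. */
int sum_mmx(const int *a, size_t n)
{
#ifdef __MMX__
    __m64 acc = _mm_setzero_si64();          /* both 32-bit lanes start at 0 */
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        __m64 pair;
        memcpy(&pair, a + i, sizeof pair);   /* alignment-safe 64-bit load */
        acc = _mm_add_pi32(acc, pair);       /* same operation as _m_paddd */
    }
    int lo = _mm_cvtsi64_si32(acc);                      /* low lane  */
    int hi = _mm_cvtsi64_si32(_mm_srli_si64(acc, 32));   /* high lane */
    _mm_empty();                             /* leave MMX state before returning */
    int total = lo + hi;
    if (i < n)                               /* odd n: one element is left over */
        total += a[i];
    return total;
#else
    return sum_plain(a, n);
#endif
}
```

Both functions take the array and its element count, e.g. sum_mmx(a, n), so either can be dropped into the same timing harness and the generated assembly compared with gcc -S.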

thank you,


jack
