Re: prefetch question

Thomas Witzel wrote:

The option -fprefetch-loop-arrays generates prefetch instructions for
non-vectorized code, but not for vectorized code. Is that the intended
behavior? Is there a way to get prefetching for the vectorized loops
as well?
Thanks, Thomas

Example (compiled with g++ 4.5.0):
g++ -O3 -fcx-fortran-rules -fprefetch-loop-arrays -mtune=core2 -march=core2 -mssse3 -S -c ../test_loop.cpp

The code for a complex multiplication loop written this way:

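#include <complex>
#include <cstddef>

// N is assumed to be defined elsewhere in test_loop.cpp; its value is not shown here.
void f(std::complex<float> *a, std::complex<float> *b, std::complex<float> *r)
{
        for (std::size_t s = 0; s < N; s++)
                r[s] = a[s] * b[s];
}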

is generated twice, once vectorized (.L3) and once not (.L5):


It's certainly hard to guess the effect of prefetching only in the remainder loop (early 32-bit Pentium 4 style?). Since you have set -mtune=core2, it seems reasonable that the compiler would not optimize for the 32-bit Athlon, which may have been the most recent common CPU without effective hardware prefetch for vectorized loops. I don't really expect gcc to attempt further optimization specific to -mssse3, now that it has been out of production for about 2 years.
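
If you want prefetch hints regardless of what the vectorizer decides, one option is to insert them by hand with GCC's __builtin_prefetch. The following is only a minimal sketch, not something taken from this thread: the function name, the explicit length parameter n, the 64-byte cache line, and the lookahead of 64 elements are all assumptions you would need to tune and measure on your own hardware.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical lookahead, in elements. This is a guess, not a measured value.
constexpr std::size_t kPrefetchAhead = 64;

// Assumes 64-byte cache lines; std::complex<float> is 8 bytes,
// so one line holds 8 elements.
constexpr std::size_t kElemsPerLine = 64 / sizeof(std::complex<float>);

void multiply_with_prefetch(const std::complex<float> *a,
                            const std::complex<float> *b,
                            std::complex<float> *r,
                            std::size_t n)
{
        for (std::size_t s = 0; s < n; s++) {
                // Issue one hint per assumed cache line, and only while the
                // prefetched element is still inside the arrays.
                if (s % kElemsPerLine == 0 && s + kPrefetchAhead < n) {
                        __builtin_prefetch(&a[s + kPrefetchAhead], 0, 0); // read
                        __builtin_prefetch(&b[s + kPrefetchAhead], 0, 0); // read
                        __builtin_prefetch(&r[s + kPrefetchAhead], 1, 0); // write
                }
                r[s] = a[s] * b[s];
        }
}
```

The extra branch may well change what the vectorizer does with the loop, so it is worth checking the -S output again. On a CPU with a working hardware prefetcher, the hints may make no difference or even slow things down.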
