On Tue, 14 Aug 2007, Daniel Berlin wrote:
>> I'm trying to get more of my code to vectorize, but very few of the loops
>> do. ICC (sorry for mentioning it, don't mean to offend anyone) manages to
>> vectorize all of the loops in my code, so it is doable (and succeeding
>> seems to yield a speed-up of 5-10x!).
>> Is there a way to help out the vectorizer with pragmas and suchlike, to
>> resolve possible misperceived vector dependence issues?
> For starters, it would help to tell us which version of GCC you are
> trying with.
4.1.0 and 4.2.1 both produce pretty much identical results.
> Second, your code doesn't compile as you've written it, so i can't
> diagnose more without you fixing it :)
OK, I've attached a compilable version. The compiler says:
test.cxx:15: note: not vectorized: can't determine dependence between
this_10->D.2134.Curve[i.0_30] and this_10->D.2134.Curve[i.0_30]
test.cxx:23: note: not vectorized: unsupported use in stmt.
test.cxx:32: note: not vectorized: unsupported use in stmt.
> Third, even with some modifications to make it compile, your class
> isn't used, so we wouldn't bother to compile it.
It compiles now, as per the attached code file. I would have thought that
code demonstrating that a one-line loop doesn't vectorize would be enough
to diagnose a problem with the vectorizer. I'm not asking for help
debugging my code; I'm asking for advice on how to make a fairly
obviously vectorizable loop vectorize.
Thanks.
Gordan
// test.cxx
#include "test.h"

float CurveFit::Foo ()
{
    static float BestFit[4];
    static unsigned int i;
    static unsigned int x;
    static float xx;
    static unsigned int LocalDataC;
    static float CurrentError;

    CurrentError = 0;
    LocalDataC = DataC;

    // This should vectorize but doesn't
    for (i = 0; i < 4; i++)
        Curve[i] += BestFit[i];

    // ...

    static float CacheX[1024];

    // This should vectorize but doesn't
    for (x = 0, xx = 0; x < LocalDataC; x++)
        CacheX[x] = a * xx++;

    // ...

    static float Temp;
    static float CacheParam[1024];

    // This should vectorize but doesn't
    for (x = 0; x < LocalDataC; x++)
    {
        Temp = DataV[x] - CacheParam[x] - Curve[3];
        CurrentError += (Temp * Temp);
    }

    // ...

    return CurrentError;
}

int main ()
{
    CurveFit *test;
    test = new CurveFit;
}

// test.h
class CurveFit
{
public:
    union
    {
        float Curve[4];
        struct
        {
            float a;
            float b;
            float c;
            float d;
        };
    };

    unsigned int DataC;
    float DataV[1024];

    float Foo ();
};
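
For comparison, here is a minimal sketch of the same three loops with the
static loop counters, the static float counter and the static accumulator
replaced by locals. It assumes, without having been checked against 4.1.0 or
4.2.1, that the statics are part of what the vectorizer trips over.
CurveFitSketch is a hypothetical stand-in for CurveFit, not the original
class: the a/b/c/d union is dropped and the function-local static arrays
become members so the example stands on its own.

```cpp
// Hypothetical stand-in for CurveFit (assumption: not the original class).
struct CurveFitSketch
{
    float Curve[4];
    float BestFit[4];
    unsigned int DataC;        // assumed to stay <= 1024, as in the original
    float DataV[1024];
    float CacheX[1024];
    float CacheParam[1024];

    float Foo ()
    {
        // Local copy of the trip count, as LocalDataC was in the original.
        const unsigned int n = DataC;

        // Loop 1: local counter instead of a static one.
        for (unsigned int i = 0; i < 4; i++)
            Curve[i] += BestFit[i];

        // Loop 2: the index is converted to float instead of stepping the
        // float counter xx. Same values while x stays below 2^24.
        const float a = Curve[0];   // was the union member 'a'
        for (unsigned int x = 0; x < n; x++)
            CacheX[x] = a * static_cast<float>(x);

        // Loop 3: local accumulator and temporary; Curve[3] is read once.
        const float d = Curve[3];
        float err = 0.0f;
        for (unsigned int x = 0; x < n; x++)
        {
            const float t = DataV[x] - CacheParam[x] - d;
            err += t * t;
        }
        return err;
    }
};
```

The third loop is a floating-point reduction, which GCC only vectorizes when
it is allowed to reorder the additions, for example with -ffast-math on top
of -O2 -ftree-vectorize. Whether the first two loops vectorize in this form
on a given GCC version would need checking against the vectorizer's notes.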