Re: Helping out the Vectorizer

On Tue, 14 Aug 2007, Daniel Berlin wrote:

>> I'm trying to get more of my code to vectorize, but very few of the loops
>> do. ICC (sorry for mentioning it, don't mean to offend anyone) manages to
>> vectorize all of the loops in my code, so it is doable (and succeeding
>> seems to yield a speed-up of 5-10x!).
>>
>> Is there a way to help out the vectorizer with pragmas and suchlike, to
>> resolve possible misperceived vector dependence issues?
>
> For starters, it would help to tell us which version of GCC you are trying with.

4.1.0 and 4.2.1 both produce pretty much identical results.

> Second, your code doesn't compile as you've written it, so i can't
> diagnose more without you fixing it :)

OK, attached a compilable version. Compiler says:

test.cxx:15: note: not vectorized: can't determine dependence between this_10->D.2134.Curve[i.0_30] and this_10->D.2134.Curve[i.0_30]
test.cxx:23: note: not vectorized: unsupported use in stmt.
test.cxx:32: note: not vectorized: unsupported use in stmt.

> Third, even with some modifications to make it compile, your class
> isn't used, so we wouldn't bother to compile it.

It compiles now, as per the attached code file. I would have thought that
code demonstrating that a one-line loop doesn't vectorize would be enough
to diagnose a problem with the vectorizer. I'm not asking for help with
debugging my code; I'm asking for advice on how to make a fairly obviously
vectorizable loop vectorize.

Thanks.

Gordan

#include "test.h"
float CurveFit::Foo ()
{
	static float BestFit[4];
	static unsigned int i;
	static unsigned int x;
	static float xx;
	static unsigned int LocalDataC;
	static float CurrentError;

	CurrentError = 0;
	LocalDataC = DataC;

	// This should vectorize but doesn't
	for (i = 0; i < 4; i++)
		Curve[i] += BestFit[i];

	// ...

	static float CacheX[1024];

	// This should vectorize but doesn't
	for (x = 0, xx = 0; x < LocalDataC; x++)
		CacheX[x] = a * xx++;

	// ...

	static float Temp;
	static float CacheParam[1024];

	// This should vectorize but doesn't
	for (x = 0; x < LocalDataC; x++)
	{
		Temp = DataV[x] - CacheParam[x] - Curve[3];
		CurrentError += (Temp * Temp);
	}

	// ...

	return CurrentError;
}

int main ()
{
	CurveFit *test;
	test = new CurveFit;
}

class CurveFit
{
	public:
		union
		{
			float Curve[4];
			struct
			{
				float a;
				float b;
				float c;
				float d;
			};
		};

		unsigned int    DataC;
		float           DataV[1024];

		float Foo ();
};
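
For reference, here is a minimal sketch of how the three loops in Foo()
could be restated to give the vectorizer less to prove. It is not a
confirmed fix for GCC 4.1.0 or 4.2.1. The function names (add_best_fit,
fill_cache_x, sum_squared_error) and their parameters are hypothetical and
do not appear in test.cxx. The sketch assumes that non-static locals spare
the compiler a store to memory on every iteration, that __restrict__ (a GCC
extension in C++) rules out aliasing between the arrays, and that deriving
the value from an integer index avoids the floating-point induction
variable xx. Even then, GCC normally vectorizes a floating-point sum such
as the last loop only when it is allowed to reassociate it, for example
with -ffast-math.

```cpp
// Hypothetical free-function versions of the three loops in Foo().
// Names and signatures are illustrative only; they are not part of test.cxx.

// Loop 1: local (non-static) index, restrict-qualified pointers so the
// compiler may assume curve and best_fit do not overlap.
void add_best_fit(float *__restrict__ curve, const float *__restrict__ best_fit)
{
	for (int i = 0; i < 4; i++)
		curve[i] += best_fit[i];
}

// Loop 2: compute the value from the integer index instead of
// incrementing a separate float (xx++) on every iteration.
void fill_cache_x(float *__restrict__ cache_x, float a, int n)
{
	for (int x = 0; x < n; x++)
		cache_x[x] = a * static_cast<float>(x);
}

// Loop 3: keep the temporary and the accumulator in locals rather than
// statics; offset plays the role of Curve[3] in the original loop.
float sum_squared_error(const float *__restrict__ data_v,
                        const float *__restrict__ cache_param,
                        float offset, int n)
{
	float error = 0.0f;
	for (int x = 0; x < n; x++)
	{
		const float temp = data_v[x] - cache_param[x] - offset;
		error += temp * temp;
	}
	return error;
}
```

Inside Foo() these would be called with the class members, e.g.
add_best_fit(Curve, BestFit) and
sum_squared_error(DataV, CacheParam, Curve[3], DataC), assuming DataC fits
in an int.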
