Re: what optimization can be expected?

Burlen Loring <burlen.loring@xxxxxxxxx> · Fri, 24 Apr 2009 08:24:04 -0400

Tim Prince wrote:
burlen wrote:

Can loops with a non-unit stride be automagically vectorized with SSE by
the compiler?

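#include <cmath>    // std::sqrt
#include <cstddef>  // std::size_t

// Euclidean norm of n tuples, each with nComp interleaved components.
template <int nComp>
void norm(double *result, double *data, std::size_t n)
{
 double *pDat=data;
 double *pRes=result;

 for (std::size_t i=0; i<n; ++i)
 {
   // Sum of squares of the current tuple's components.
   *pRes=(*pDat)*(*pDat);
   for (int j=1; j<nComp; ++j)
   {
     *pRes+=pDat[j]*pDat[j];
   }
   *pRes=std::sqrt(*pRes);

   pRes+=1;      // unit stride in the output
   pDat+=nComp;  // stride of nComp in the input
 }
}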

Your inner loop appears to have unit stride, and might be optimized easily
if you didn't write it with potential aliases.  If you meant
inner_product(), why not use that?
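
A minimal sketch of what the aliasing remark could look like in practice;
this is an illustration, not code from the thread. The name norm_restrict
is hypothetical, and __restrict__ is a GCC extension (also accepted by
Clang), not standard C++. The assumption is that result and data never
overlap. Accumulating into a local value means the compiler no longer has
to assume that each store through the result pointer might change the
input data.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>

// Hypothetical variant of norm(): the caller promises that result and data
// do not overlap (__restrict__ is a GCC/Clang extension).
template <int nComp>
void norm_restrict(double *__restrict__ result,
                   const double *__restrict__ data,
                   std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
  {
    // Start of the i-th tuple; the tuples are nComp doubles apart.
    const double *p = data + i * nComp;

    // Dot product of the tuple with itself, kept in a local so that
    // result[i] is written only once per tuple.
    const double sum = std::inner_product(p, p + nComp, p, 0.0);

    result[i] = std::sqrt(sum);
  }
}
```

Because nComp is a template parameter, the trip count of the inner
reduction is known at compile time, so the compiler is free to unroll it
fully. Whether it then uses packed SSE instructions depends on the
compiler version and flags.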

The inner loop does have unit stride, but its trip count is usually small,
between 1 and 12, while the outer loop is usually large, in the tens to
hundreds of thousands of iterations. That example is just one simple
situation that I encounter. I want to understand how the compiler applies
SSE optimization. What can g++ automatically optimize with SSE? Is this
documented somewhere?
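
As a rough illustration of the kind of loop an auto-vectorizer generally
handles well: a long, countable loop with unit stride, no possible
aliasing, and no dependence between iterations. The sketch below assumes
the data could be stored one component per array (structure of arrays),
which is a change of layout not proposed anywhere in this thread; norm_soa
and comp are hypothetical names, and __restrict__ is a GCC extension.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical structure-of-arrays variant: comp[j] points to n contiguous
// values of component j, and none of those arrays overlap result.
void norm_soa(double *__restrict__ result,
              const double *const *comp,
              int nComp, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    result[i] = 0.0;

  for (int j = 0; j < nComp; ++j)
  {
    const double *__restrict__ c = comp[j];
    // Long loop, unit stride, independent iterations.
    for (std::size_t i = 0; i < n; ++i)
      result[i] += c[i] * c[i];
  }

  // Separate pass for the square root.
  for (std::size_t i = 0; i < n; ++i)
    result[i] = std::sqrt(result[i]);
}
```

With g++, -O3 turns on -ftree-vectorize, and -msse2 selects SSE2 on 32-bit
x86 targets. The vectorizer can report which loops it transformed: older
releases use -ftree-vectorizer-verbose=N, newer ones -fopt-info-vec. The
square-root loop may additionally need -fno-math-errno, since sqrt() is
otherwise allowed to set errno. The GCC manual's "Optimize Options" section
and the GCC project page on auto-vectorization describe the supported loop
forms.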

I want to write code in a way that takes advantage of g++'s capabilities.
It's important for me to let g++ do the optimization because the code
needs to be cross-platform.

I know gmail is fashionable, but there's plenty of reason for it going in
the spam box, and no effort at google to improve the situation.

Sorry but that's all I've got at the moment.

Thanks

Burlen