Re: "unhandled use" in vectorizing a dot product?


 



Benjamin Redelings I wrote:



2. Interestingly, the following is recognized WITHOUT -ffast-math:

  for(i=0;i<argc;i++)
    f4[i] += f1[i]*f2[i]*f3[i];

That's not a reduction; re-association from strict C standard order isn't required to vectorize it. icc makes a similar distinction: for example, this loop should vectorize whether or not icc -fp-model source is set.
If I change this to the following, then it needs -ffast-math:

  for(i=0;i<argc;i++)
    sum += f1[i]*f2[i]*f3[i];

This is essentially doing the first thing, plus also summing the f4[i]. I guess that is the problem?
Yes. Vectorization involves at least 4 parallel sums (for the float data type), with the partial sums added together at the end, so the result is numerically different from the non-vector case (often, but not always, slightly more accurate). The result may also vary slightly with alignment, and may differ according to whether -msse3 is set.
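
To make the re-association concrete, here is a minimal sketch in plain C of the two summation orders. It is only an illustration, not what gcc actually emits: the function names dot3_scalar and dot3_partial are made up for this example, and the width of 4 lanes is an assumption matching a 128-bit SSE register of floats.

```c
#include <stddef.h>

/* Strict C order: a single accumulator, additions done one after another. */
float dot3_scalar(const float *f1, const float *f2, const float *f3, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += f1[i] * f2[i] * f3[i];
    return sum;
}

/* Re-associated order: four independent partial sums, one per lane of an
   assumed 4-wide float vector, combined only after the main loop. */
float dot3_partial(const float *f1, const float *f2, const float *f3, size_t n)
{
    float part[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: each lane accumulates every fourth element. */
    for (; i + 4 <= n; i += 4)
        for (size_t lane = 0; lane < 4; lane++)
            part[lane] += f1[i + lane] * f2[i + lane] * f3[i + lane];

    /* Add the partials at the end. */
    float sum = (part[0] + part[1]) + (part[2] + part[3]);

    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        sum += f1[i] * f2[i] * f3[i];
    return sum;
}
```

The two functions add the same products in a different order, so they agree exactly only when no rounding occurs. The compiler may not switch from the first order to the second on its own unless something like -ffast-math permits it. The f4[i] += ... loop has no such issue because each element is updated independently.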

