> On Tue, 14 Aug 2007, Dorit Nuzman wrote:
>
> >> In this case, 4.3 will vectorize the loop on 15.
> >>
> >> The others are just too complex of reduction patterns right now, it
> >> looks like.
> >>
> >> Feel free to file a missed optimization bug on it :)
> >
> > Actually there's already a PR for it - PR32824. I'm getting more and more
> > testcases where this pattern occurs... I hope the generic reduction
> > detection will be ready in the near future...
>
> Another problem seems to be that there seems to be little effort on part
> of GCC to automatically align arrays to a 16-byte boundary - and there's
> no excuse for that in case of static ones. (Unless, of course, I'm
> misinterpreting the vectorizer report "vectorizing unaligned access".)
> Vectorizing operations on unaligned arrays is considerably less efficient.
>

We do force the alignment of static arrays (e.g. BestFit in your code
example). The reason we end up generating unaligned accesses to it is that
GCC can't force the alignment of Curve, and since GCC currently doesn't
support vectorization of misaligned stores, what it does is peel a few
iterations from the loop until we reach an aligned address, at which point
we enter the vectorized loop. While this aligns the store, at the same time
it makes the load from BestFit unaligned... (but vectorizing unaligned
loads is something GCC knows how to do).

> On a separate note, why is (float * float) getting transformed to a
> powf() call (according to the vectorizer report, again), when multiplying
> seems to be faster for low powers?
>

The vectorizer actually detects the power-of-2 pattern and replaces it with
a multiplication:

test.cpp:32: note: pattern recognized: Temp.8_29 * Temp.8_29

(but this is only for the purpose of vectorization; if vectorization
doesn't take place, the pow stays intact)

> And is there an ETA on a vectorizable sinf()?
>

There is initial support to vectorize calls to math functions, given a
vectorized math lib. I'm not sure what the status of that is. Richard may
know.

dorit

> Gordan
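
P.S. To make the peeling explanation concrete, here is a rough hand-written sketch of the loop shape it produces. This is only an illustration, not what GCC actually emits: the function names peel_count and square_peeled are hypothetical, the loop body (squaring each element) is an assumed stand-in for the real one, which is not shown in this thread, and the sketch assumes 4-byte floats and 16-byte vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many leading scalar iterations are needed
// before `p` reaches a 16-byte boundary, capped at n.
// Assumes sizeof(float) == 4 and that p is at least float-aligned.
std::size_t peel_count(const float* p, std::size_t n) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    assert(addr % sizeof(float) == 0);
    const std::size_t misalign = addr % 16;  // bytes past the last boundary
    const std::size_t peel =
        misalign ? (16 - misalign) / sizeof(float) : 0;
    return peel < n ? peel : n;
}

// Assumed loop body: curve[i] = best_fit[i] * best_fit[i].
// The loop is split the way a peeled loop would be.
void square_peeled(float* curve, const float* best_fit, std::size_t n) {
    std::size_t i = 0;

    // Prologue: scalar iterations until the store address is aligned.
    const std::size_t peel = peel_count(curve, n);
    for (; i < peel; ++i)
        curve[i] = best_fit[i] * best_fit[i];

    // Main body: curve + i is now 16-byte aligned, so each group of four
    // stores could be one aligned vector store. best_fit + i is in general
    // NOT aligned here, even if best_fit itself is, whenever peel is not
    // a multiple of 4.
    for (; i + 4 <= n; i += 4)
        for (std::size_t k = 0; k < 4; ++k)
            curve[i + k] = best_fit[i + k] * best_fit[i + k];

    // Epilogue: leftover scalar iterations.
    for (; i < n; ++i)
        curve[i] = best_fit[i] * best_fit[i];
}
```

With an aligned best_fit and a curve pointer that is 4 bytes past a 16-byte boundary, the prologue runs 3 iterations; after that the stores are aligned, but the loads start 12 bytes past a boundary, which is the situation described above.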