On Tue, 11 Apr 2017, Yifei wrote:
> I'm wondering why gcc fails to do a straightforward optimization? And how can
> I do, as a work around, to avoid writing an explicit loop?

(note, the example is vectorized with -mveclibabi={acml,svml}, but then it
requires the corresponding external library and -ffast-math)

Without -mveclibabi, this works for loop vectorization with recent Glibc,
because it provides vectorized implementations in libmvec and has
__attribute__((simd)) on a few math functions. However, using SIMD clones
in SLP vectorization is currently not implemented (veclib calls and SIMD clones
are handled via different paths).

Alexander