Another possible reason is that the CPU I'm using (AMD Phenom II) does
not fully support SSE4.1, which adds many more integer instructions.
The reference is Wikipedia's SSE4 entry:
http://en.wikipedia.org/wiki/SSE4

"AMD currently only supports 4 instructions from the SSE4 instruction
set, but have also added two new SSE instructions that it named SSE4a.
These instructions are not found in Intel's processors supporting
SSE4.1 and alternatively AMD processors aren't supporting Intel's
SSE4.1. Support was added for SSE4a for unaligned SSE load-operation
instructions (which formerly required 16-byte alignment)."

Best regards,

LC

On Thu, Aug 20, 2009 at 1:05 PM, Lingchuan (LC) Meng<lingchuanmeng@xxxxxxxxx> wrote:
> Hi Brian,
>
> Thank you! Here are the header files included:
>
> #include <xmmintrin.h>
> #include <emmintrin.h>
> #include <smmintrin.h>
>
> The compilation is okay, no error message. It's during execution that
> I see the "Illegal instruction" error message.
>
> Best regards,
>
> LC
>
>
>
> On Thu, Aug 20, 2009 at 1:01 PM, Brian Budge<brian.budge@xxxxxxxxx> wrote:
>> Hi LC -
>>
>> Do you have the correct header files included? You need the
>> *mmintrin.h family of headers for these functions to work.
>>
>> Brian
>>
>> On Thu, Aug 20, 2009 at 9:09 AM, Lingchuan (LC)
>> Meng<lingchuanmeng@xxxxxxxxx> wrote:
>>> Hi all,
>>>
>>> When vectorizing some of my integer code, I had this "Illegal
>>> instruction" problem with SIMD intrinsics in GCC. I believe the error
>>> is triggered by the instruction:
>>>
>>> _mm_store_si128((__m128i *)c, v2);
>>>
>>> where c and v2 are defined as:
>>>
>>> int c[4] __attribute__((aligned(16))) = {0, 0, 0, 0};
>>> __m128i v2;
>>>
>>> And the CFLAGS are
>>>
>>> -I/usr/local/papi-3.6.1/include -L/usr/local/papi-3.6.1/lib/ -lpapi
>>> -lm -O2 -msse4.1 -fno-inline-small-functions
>>>
>>> I searched around when starting with SIMD intrinsics, and didn't find
>>> a complete instruction set for GCC. So I ended up using something
>>> similar from MSDN.
>>>
>>> What's the legal intrinsic function in GCC to store a __m128i vector
>>> back to an integer array?
>>>
>>> Thanks,
>>>
>>> LC
>>>
>>
>
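
For reference, here is a minimal sketch of how the two points in this thread could be checked. It assumes an x86 target and a GCC new enough to provide __builtin_cpu_supports (4.8 or later). The helper names cpu_has_sse41 and add4_sse2 are hypothetical and only for illustration. _mm_store_si128 itself is an SSE2 intrinsic declared in <emmintrin.h>, so the store alone should not need SSE4.1. A likely (but unconfirmed) cause of the fault is that -msse4.1 lets GCC emit SSE4.1 instructions elsewhere in the program.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128, _mm_add_epi32, _mm_store_si128 */

/* Hypothetical helper: returns 1 if the running CPU reports SSE4.1, else 0.
 * Uses the GCC builtins __builtin_cpu_init / __builtin_cpu_supports. */
static int cpu_has_sse41(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") != 0;
}

/* Hypothetical helper: adds two 4-element int arrays using SSE2 only and
 * stores the result with _mm_store_si128, which requires a 16-byte
 * aligned destination. */
static void add4_sse2(const int *a, const int *b, int *out)
{
    /* The aligned store faults on a misaligned pointer, so check first. */
    assert(((uintptr_t)out & 15) == 0);

    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);  /* unaligned load */
    __m128i v2 = _mm_add_epi32(va, vb);                /* four 32-bit adds */

    _mm_store_si128((__m128i *)out, v2);               /* aligned store */
}
```

A caller would declare the destination as in the original post, int c[4] __attribute__((aligned(16))), call add4_sse2(a, b, c), and use the SSE4.1 code path only when cpu_has_sse41() returns 1. Building this file with -msse2 instead of -msse4.1 keeps the compiler from choosing SSE4.1 instructions on its own.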