Hello
I came across the following behaviour. I think it is a bug, but where
should I file it?
I filed it against qemu-kvm in the Red Hat Jira for the time being, but
that seems to be a closed environment.
Source code first:
#include <stdio.h>
#include <string.h>
#include <immintrin.h>

int main(void) {
    /* _mm256_set_epi32 takes the highest element first, so element 0 is 8. */
    __m256i test1 = _mm256_set_epi32(1,2,3,4,5,6,7,8);
    __m256i test2 = _mm256_set_epi32(1,2,3,4,5,6,7,8);

    /* Print both vectors before the arithmetic. */
    for (int count = 0; count < 8; count++) {
        printf("[%d] %d ", count, *((int*)(&test1) + count));
    }
    printf("\n");
    for (int count = 0; count < 8; count++) {
        printf("[%d] %d ", count, *((int*)(&test2) + count));
    }
    printf("\n");

    /* Both of these are AVX2 intrinsics (256-bit integer operations). */
    test1 = _mm256_add_epi32(test1, test2);
    test2 = _mm256_mullo_epi32(test1, test2);

    /* Print both vectors after the arithmetic. */
    for (int count = 0; count < 8; count++) {
        printf("[%d] %d ", count, *((int*)(&test1) + count));
    }
    printf("\n");
    for (int count = 0; count < 8; count++) {
        printf("[%d] %d ", count, *((int*)(&test2) + count));
    }
    printf("\n");
    return 0;
}
Compilation with "gcc -mavx -o avx2 avx2.c" fails, because the
intrinsics used are AVX2 intrinsics.
When compiled with "gcc -mavx2 -o avx2 avx2.c" and run on an E7-4880 v2,
this yields "illegal instruction".
When run on a KVM-virtualized "Sandy Bridge" CPU where the underlying
host CPU is capable of AVX2 (e.g. Haswell or Skylake), it runs, even
though the guest advertises only the avx flag and not avx2:
$ ./avx2
[0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
[0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
[0] 16 [1] 14 [2] 12 [3] 10 [4] 8 [5] 6 [6] 4 [7] 2
[0] 128 [1] 98 [2] 72 [3] 50 [4] 32 [5] 18 [6] 8 [7] 2
The same holds for FMA3 instructions (the intrinsic I used is
_mm256_fmadd_pd(a,b,c)).
When I emulate the CPU as Westmere, it yields "illegal instruction".
Regards, Gerrit