On Mon, Oct 23, 2023, Gerrit Slomma wrote:
> Compilation with "gcc -mavx -i avx2 avx2.c" fails, due to used intrinsics
> are AVX2-intrinsics.
> When compiled with "gcc -mavx2 -o avx2 avx2.c" an run on a E7-4880v2 this
> yields "illegal instruction".
> When run on a KVM-virtualized "Sandy Bridge"-CPU, but the underlying CPU is
> capable of AVX2 (i.e. Haswell or Skylake) this runs, despite advertised flag
> is only avx:

This is expected.  Many AVX instructions have virtualization holes, i.e. hardware
doesn't provide controls that allow the hypervisor (KVM) to precisely disable (or
intercept) specific sets of AVX instructions.  The virtualization holes are "safe"
because the instructions don't grant access to novel CPU state, just new ways of
manipulating existing state.  E.g. AVX2 instructions operate on existing AVX state
(YMM registers).

AVX512 on the other hand does introduce new state (ZMM registers) and so hardware
provides a control (XCR0.AVX512) that KVM can use to prevent the guest from
accessing the new state.

In other words, a misbehaving guest that ignores CPUID can hose itself, e.g. if
the VM gets live migrated to a host that _doesn't_ natively support AVX2, then the
workload will suddenly start getting #UDs.  But the integrity of the host and the
VM's state is not in danger.

> $ ./avx2
> [0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
> [0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
> [0] 16 [1] 14 [2] 12 [3] 10 [4] 8 [5] 6 [6] 4 [7] 2
> [0] 128 [1] 98 [2] 72 [3] 50 [4] 32 [5] 18 [6] 8 [7] 2
>
> this holds for FMA3-instructions (i used intrinsic is
> _mm256_fmadd_pd(a,b,c).)
>
> When i emulate the CPU as Westmere it yields "illegal instruction".

This is also expected.  Westmere doesn't support AVX, and so KVM disallows the
guest from setting XCR0.YMM.
Buried in the "PROGRAMMING WITH INTEL® AVX, FMA, AND INTEL® AVX2" section of the
SDM is this snippet:

  If YMM state management is not enabled by an operating systems, Intel AVX
  instructions will #UD regardless of CPUID.1:ECX.AVX[bit 28].

I.e. Westmere doesn't have an AVX2 virtualization hole because it doesn't support
AVX in the first place.
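
For reference, a guest that wants to stay out of trouble would check both
CPUID and XCR0 before touching AVX2, rather than relying on the instructions
happening to work.  The following is only a rough sketch of that check, not
anything taken from KVM or from the test program above: the function names are
made up for illustration, and it assumes GCC or Clang on x86 (for <cpuid.h> and
the inline asm).  The bit positions are the ones documented in the SDM.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Bit positions as documented in the Intel SDM. */
#define CPUID1_ECX_OSXSAVE (1u << 27) /* OS enabled XSAVE; XGETBV is usable */
#define CPUID1_ECX_AVX     (1u << 28) /* CPUID.1:ECX.AVX[bit 28] */
#define CPUID7_EBX_AVX2    (1u << 5)  /* CPUID.(EAX=7,ECX=0):EBX.AVX2[bit 5] */
#define XCR0_SSE           (1u << 1)  /* XMM state enabled */
#define XCR0_AVX           (1u << 2)  /* YMM state enabled ("XCR0.YMM" above) */

/*
 * Hypothetical helpers.  They take raw register values so the decision logic
 * is separate from the code that reads the registers.
 */
int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    uint32_t need = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;

    if ((cpuid1_ecx & need) != need)
        return 0;
    /* YMM state management must be enabled, else AVX instructions #UD. */
    return (xcr0 & (XCR0_SSE | XCR0_AVX)) == (XCR0_SSE | XCR0_AVX);
}

int avx2_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx, uint64_t xcr0)
{
    /* AVX2 reuses YMM state, so it needs the AVX checks plus its own bit. */
    return avx_usable(cpuid1_ecx, xcr0) &&
           (cpuid7_ebx & CPUID7_EBX_AVX2) != 0;
}

/* Reads the real registers on x86; reports "not usable" elsewhere. */
int host_reports_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int a = 0, b = 0, c = 0, d = 0;
    unsigned int ecx1, ebx7 = 0;
    uint32_t lo = 0, hi = 0;

    if (!__get_cpuid(1, &a, &b, &c, &d))
        return 0;
    ecx1 = c;
    if (__get_cpuid_count(7, 0, &a, &b, &c, &d))
        ebx7 = b;
    /* XGETBV itself #UDs unless OSXSAVE is set, so only read XCR0 then. */
    if (ecx1 & CPUID1_ECX_OSXSAVE)
        __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return avx2_usable(ecx1, ebx7, ((uint64_t)hi << 32) | lo);
#else
    return 0;
#endif
}
```

With something like this, the "Sandy Bridge" guest on a Haswell host would get 0
from host_reports_avx2() (CPUID doesn't advertise AVX2) and could fall back to
an AVX-only path, so it would keep working after a migration to a host without
AVX2.  On the Westmere model the AVX check already fails, since CPUID doesn't
advertise AVX and XCR0.YMM can't be set.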