That only targets Broadwell. If the processor is Skylake or something newer, the program can't make full use of the newer processor's instructions. Do we have a better solution for this kind of use case? It seems Intel ICC provides such a solution.

2018-05-08 3:26 GMT+08:00 Jonathan Wakely <jwakely.gcc@xxxxxxxxx>:
> On 7 May 2018 at 11:46, Xi Ruoyao wrote:
>> On 2018-05-07 18:30 +0800, Feng Longda wrote:
>>> If we set -march/-mtune to specify the processor, we can obtain some extra
>>> performance improvement.
>>>
>>> I want to run my application on different Intel processor families, for
>>> example: broadwell, skylake, haswell, skylake-avx512.
>>>
>>> What should I set? Should it look like the following?
>>>
>>> -march=broadwell -mtune=intel -mmmx -msse -msse -msse2 -msse3 -mavx
>>> -mavx2 -mavx=512f -mavx512pf ...
>>
>> No. Then GCC would use AVX-512, and the program can't run on processors
>> without AVX-512. (You'll see "Illegal Instruction - Core Dumped").
>>
>> You should use the target_clones attribute.
>> cf. https://gcc.gnu.org/onlinedocs/gcc-8.1.0/gcc/Common-Function-Attributes.html
>
> Or just use -march=broadwell and see if the performance is good enough.
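
For reference, the target_clones approach suggested in the quoted reply might look roughly like the sketch below. It is only a minimal illustration, not a tuned solution: the function name saxpy and the MULTIVERSION macro are made up for this example, and the choice of clones ("default", "avx2", "avx512f") is an assumption about which processors matter. It also assumes GCC on x86-64 Linux with glibc, since the attribute relies on IFUNC support; on other setups the macro expands to nothing and only one version is built.

```c
#include <stddef.h>

/* Assumption: GCC targeting x86-64 Linux (glibc), where target_clones is
   implemented through IFUNC. On any other compiler or platform the
   attribute is dropped and a single baseline version is compiled. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && defined(__linux__)
#define MULTIVERSION \
    __attribute__((target_clones("default", "avx2", "avx512f")))
#else
#define MULTIVERSION
#endif

/* Hypothetical hot loop: y[i] += a * x[i].
   With MULTIVERSION, GCC emits one copy of this function per listed
   target plus a resolver that picks the best copy for the CPU the
   program is actually running on, when the program is loaded. */
MULTIVERSION
void saxpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The rest of the program is still built for whatever baseline -march is given on the command line, so the binary should start on the oldest processor in the list, while the annotated function can use AVX2 or AVX-512 where available. Whether that is worth it over a plain -march=broadwell build is something to measure.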