On 31/08/2015 15:09, Tim Prince wrote:

> On 8/31/2015 7:29 AM, Mason wrote:
>
>> I think -fomit-frame-pointer also helps sometimes?
>> (It might already be enabled for -O{2,3,s} on amd64?)
>>
>> I also wanted to specify -march because I think it may allow gcc to
>> use SSE2. (Although SSE2 may be enabled by default on amd64?)
>>
>> https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html
>>
>> I've tried -march=core2 but I'm not sure older AMD chips support the same
>> extensions (SSSE3 for example).
>>
>> So maybe -march=core2 -mno-ssse3 ?
>>
>> But after reading the documentation more closely, it appears there is
>> an option addressing my use-case: -mtune=generic (and in fact, it looks
>> like Ubuntu's gcc was compiled with --with-tune=generic so this should
>> be the default, IIUC).
>
> Some of the docs say mtune=generic is default. march= implying sse2
> also is a default for AMD64, and all CPUs of the last decade support
> SSE3 (but not SSSE3, which is a relatively unimportant option anyway).
> Are you forgetting that mtune doesn't pick instruction set, so without
> march= you will get only the default (SSE2 for 64-bit mode)? sse3 is
> quite important for complex arithmetic, which you didn't indicate is
> relevant, so why all the fuss?

There's still some confusion in my mind. Someone please correct me
if I'm wrong.

-march=X specifies the EXACT micro-architecture to optimize for.
This option will
1) enable or disable instruction set extensions (MMX, SSE, etc)
   according to what is available on uarch X.
2) generate code optimized for uarch X (instruction selection,
   scheduling, etc)

-mtune=X will only perform item 2) i.e. without using instructions
from instruction-set extensions.

However, IIUC, specific targets can define instruction-set extensions
by default; e.g. amd64 defines MMX, SSE, SSE2, ... what else?

So if I want to support a broad range of x86-64 uarches, all I need
is -mtune=generic and the optimizer will use SSE2 where appropriate.

Now suppose I decide to make an -m32 build to support older CPUs
e.g. all the way back to P3. As far as I understand, the i386 target
doesn't specify any instruction-set extensions? Does that mean I
have to list them explicitly?

  -m32 -mtune=generic -mmmx -msse -msse2

Is there a way to say:
"Use any instruction available for this uarch, but optimize code for
this other uarch / class of uarches (mtune=generic)" ??

Basically, march does 1) and 2), mtune does 2)
Is there a way to do just 1) independently?

Regards