Hello,

I'm compiling a program that will be distributed in binary form (the source code is not open at this time), and I am trying to pick the "right" optimization flags. The program is written in C++, but most of the execution time is spent in a C library. (I'm using gcc 4.8.4 on the amd64-linux-gnu platform.)

I will test -O2, -O3, and -Os. I think -fomit-frame-pointer also helps sometimes? (It might already be enabled for -O{2,3,s} on amd64?)

I also wanted to specify -march, because I think it may allow gcc to use SSE2. (Although SSE2 may be enabled by default on amd64?)

https://gcc.gnu.org/onlinedocs/gcc-4.8.4/gcc/i386-and-x86-64-Options.html

I've tried -march=core2, but I'm not sure older AMD chips support the same extensions (SSSE3, for example). So maybe -march=core2 -mno-ssse3?

But after reading the documentation more closely, it appears there is an option addressing my use case: -mtune=generic (and in fact, it looks like Ubuntu's gcc was built with --with-tune=generic, so this should be the default, IIUC).

> Produce code optimized for the most common IA32/AMD64/EM64T
> processors. If you know the CPU on which your code will run, then you
> should use the corresponding -mtune or -march option instead of
> -mtune=generic. But, if you do not know exactly what CPU users of
> your application will have, then you should use this option.
>
> As new processors are deployed in the marketplace, the behavior of
> this option will change. Therefore, if you upgrade to a newer version
> of GCC, code generation controlled by this option will change to
> reflect the processors that are most common at the time that version
> of GCC is released.

Is it documented somewhere how -mtune=generic has changed over releases 4.7, 4.8, 4.9, and 5.0? (I'd like to get a feel for which CPUs are targeted.)

Regards.