Hello,

I am going to benchmark the performance difference between the various x86-64 uarch levels. I will be using Phoronix Test Suite, which has some support for benchmarking compilers and compile flags.

I am opposed to dropping support for older CPUs, but I will perform this test fairly. That's why I'm posting this advance notice before running the actual benchmarks. I am an Ubuntu user and I am concerned that Ubuntu may require x86-64-v2 in the not so distant future.

I will be performing this test on Ubuntu 20.04.2 with GCC 9.3.0 [1].

I am going to use selected tests from this Phoronix article [2], but I will exclude benchmarks that do not show much performance difference between the "-O1" and "-O3" compiler flags, because:
- their build scripts may ignore the CFLAGS/CXXFLAGS variables,
- they may use some assembly code or C asm intrinsics,
- they may have separate SSE4/AVX code paths,
- the compiler is unable to optimize the code much, due to its nature.

The benchmarks that remain are probably the ones that would benefit the most from compiling for different uarch levels, which should be taken into account when interpreting the results.

My rough comparison between "-O1" and "-O3" is at [3].

So, I will use the following tests:
pts/scimark2 (all tests)
pts/john-the-ripper (all tests)
pts/graphics-magick ("swirl", "resizing", "HWB Color space")
pts/coremark
pts/himeno
pts/encode-flac
pts/c-ray

Greetings,
Mateusz Jończyk

[1] GCC 9.3 does not support -march=x86-64-v2 and similar switches, so I will use switches like -march=nehalem instead.
[2] https://www.phoronix.com/scan.php?page=article&item=gcc-10900k-compiler
[3] https://openbenchmarking.org/result/2103131-HA-DRAFTUARC92

----------------------

Benchmark selection details:

Page 2 from the Phoronix article:
cryptopp - compiles some of its code with the flag "-msse4.2", so skipping it,
smhasher - same as cryptopp,
fftw - has some kernels that explicitly use AVX / AVX2, so there is no point in benchmarking it,
scimark - OK, I will also run some other tests from this benchmark, where the difference between "-O1" and "-O3" is noticeable,

Page 3:
TSCP - no real difference between "-O1" and "-O3" performance data,
John The Ripper - small difference between "-O1" and "-O3", but keeping it for now,
GraphicsMagick - OK, I'll choose the tests with the biggest difference between "-O1" and "-O3": "swirl", "resizing", "HWB Color space",

Page 4:
AOM AV1 - no performance difference between "-O1" and "-O3", so skipping,
x265 - patent-encumbered format, skipping,
Coremark - OK, "CoreMark Size 666 - Iterations Per Second",
Himeno - OK,
Stockfish - enables SSSE3 and SSE4.1 by default, so leaving it out,
FLAC Audio Encoding - OK, but the difference probably won't be big,
Minion - tries to install many package dependencies, so leaving it out for now,
LevelDB - the benchmark suite ignores the "-O1" flag and is highly susceptible to non-quiet systems (so the results were bogus),
GROMACS - same as Minion, and additionally has a long runtime IIRC,
Darmstadt Automotive Parallel Heterogeneous Suite (daphne) - requires huge amounts of disk space and seems like it will be IO-bound,
pgbench - I still have spinning HDDs, so the benchmark was IO-bound,
NGiNX - it looks like it tests both the kernel and userspace, so leaving it out,

Additional benchmarks:
n-queens - no difference between "-O1" and "-O3", build scripts seem to ignore CFLAGS,
OpenSSL - no real difference between "-O1" and "-O3",
c-ray - OK, let's include it.
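
The "-O1" versus "-O3" screening applied in the list above could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the function names, the 5% threshold and the benchmark names and scores in the example are all made-up assumptions, and the real selection also involved judgment calls (John The Ripper was kept despite a small difference).

```python
def relative_gain(o1_score, o3_score, higher_is_better=True):
    """Fractional improvement of the -O3 build over the -O1 build."""
    if higher_is_better:
        return o3_score / o1_score - 1.0
    # For time-like metrics a lower score is better, so invert the ratio.
    return o1_score / o3_score - 1.0


def select_benchmarks(results, threshold=0.05):
    """Keep benchmarks whose -O3 gain over -O1 reaches the threshold.

    `results` maps a benchmark name to (o1_score, o3_score, higher_is_better).
    The 5% default threshold is an assumption made for this sketch.
    """
    selected = []
    for name, (o1, o3, higher_is_better) in results.items():
        if relative_gain(o1, o3, higher_is_better) >= threshold:
            selected.append(name)
    return selected


# Placeholder numbers only; these are NOT measured results.
example = {
    "bench-a": (100.0, 130.0, True),   # score, higher is better
    "bench-b": (100.0, 101.0, True),   # flags barely matter
    "bench-c": (20.0, 16.0, False),    # seconds, lower is better
}

print(select_benchmarks(example))
```

A benchmark that fails this check is not necessarily insensitive to the CPU: it may simply be ignoring CFLAGS/CXXFLAGS or using hand-written SIMD paths, which is exactly why it tells us little about uarch levels.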