This is the output of the attached program, built with the arguments shown, on Windows 10 x86_64 and on Cygwin x86_64. The GNAT/2017 compiler also does not optimize it with -O3 -mavx2, nor with -O3 alone (same output). I have a Skylake Intel Core i3 PC with AVX2, but the SSE2/SSE3 switches alone should be enough to optimize this.

Output:

Bojan@DESKTOP-UT5MH6N ~
$ cd /cygdrive/c/Users/Bojan/Documents

Bojan@DESKTOP-UT5MH6N /cygdrive/c/Users/Bojan/Documents
$ gnatmake -O3 -march=skylake matrix_mul.adb -largs -s
gcc -c -O3 -march=skylake matrix_mul.adb
gnatbind -x matrix_mul.ali
gnatlink matrix_mul.ali -O3 -march=skylake -s

Bojan@DESKTOP-UT5MH6N /cygdrive/c/Users/Bojan/Documents
$ ./matrix_mul.exe
 4.00000E+01 3.60000E+01 3.20000E+01 2.80000E+01
 8.00000E+01 7.20000E+01 6.40000E+01 5.60000E+01
 1.20000E+02 1.08000E+02 9.60000E+01 8.40000E+01
 1.60000E+02 1.44000E+02 1.28000E+02 1.12000E+02
Elapsed Time is 1.725664889
 4.00000E+01 3.60000E+01 3.20000E+01 2.80000E+01
 8.00000E+01 7.20000E+01 6.40000E+01 5.60000E+01
 1.20000E+02 1.08000E+02 9.60000E+01 8.40000E+01
 1.60000E+02 1.44000E+02 1.28000E+02 1.12000E+02
Elapsed time is 0.290578222

Bojan@DESKTOP-UT5MH6N /cygdrive/c/Users/Bojan/Documents
$ gcc --version
gcc (GCC) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

C:\Users\Bojan\Documents>gnatmake -O3 -march=skylake matrix_mul.adb -largs -s
gcc -c -O3 -march=skylake matrix_mul.adb
gnatbind -x matrix_mul.ali
gnatlink matrix_mul.ali -O3 -march=skylake -s

C:\Users\Bojan\Documents>matrix_mul
 4.00000E+01 3.60000E+01 3.20000E+01 2.80000E+01
 8.00000E+01 7.20000E+01 6.40000E+01 5.60000E+01
 1.20000E+02 1.08000E+02 9.60000E+01 8.40000E+01
 1.60000E+02 1.44000E+02 1.28000E+02 1.12000E+02
Elapsed Time is 1.307566667
 4.00000E+01 3.60000E+01 3.20000E+01 2.80000E+01
 8.00000E+01 7.20000E+01 6.40000E+01 5.60000E+01
 1.20000E+02 1.08000E+02 9.60000E+01 8.40000E+01
 1.60000E+02 1.44000E+02 1.28000E+02 1.12000E+02
Elapsed time is 0.299596000

C:\Users\Bojan\Documents>gcc --version
gcc (Rev1, Built by MSYS2 project) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This is unexpected behavior, as the two elapsed times should be similar. Is there a problem with vectorizing/optimizing the code?

Attachment: matrix_mul.adb