-O3 return performance degradation

Hi,


We are building an open source public transport journey planner in C [1]. This adventure has brought us to some great compiler problems and differences between gcc and clang. At the moment of writing gcc 4.9.2 with LTO gives us the best performance.

While optimising our code we ran into the following issue: while iterating over 3 servicedays and all trips, we observed that with gcc at -O3 a return statement (or goto) consistently slows down our code by 0.4s per 1000 requests, whereas with clang the same return gives an improvement of 1.4s, and in some cases even as much as 7s.

Some observations and benchmarks (time for 1000 requests):

gcc   -O3 w/out return; 14.8s
gcc   -O3 with  return; 15.2s
clang -O3 with  return; 16.7s
clang -O3 w/out return; 18.1s
gcc   -O2 with  return; 18.6s
gcc   -O2 w/out return; 19.5s
gcc   -O1 with  return; 21.5s
gcc   -O1 w/out return; 22.5s

The return found in the commit below [1] stops iterating as soon as the first trip is found, which, given a sorted set of trips, is also the best trip. Alternative methods to stop iterating, such as adding an extra boolean condition to both for-loops, increase the benchmark time by 2s.
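
To make the two ways of stopping concrete, here is a minimal sketch of both variants. It is not the rrrr code: trip_t, hit_t, the day_mask field and both function names are made up for illustration, and the only assumption carried over is that the trips are sorted so the first match is the best one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_SERVICEDAYS 3
#define SEC_PER_DAY   86400u

/* Hypothetical stand-in for a trip; not the rrrr data structure. */
typedef struct {
    uint32_t departure;  /* seconds since midnight of its serviceday */
    uint8_t  day_mask;   /* bit d set: trip runs on serviceday d */
} trip_t;

/* Hypothetical result type: which serviceday and which trip matched. */
typedef struct {
    size_t day;
    size_t trip;
} hit_t;

/* True if trip t runs on serviceday d and departs at or after not_before. */
static bool trip_matches(const trip_t *t, size_t d, uint32_t not_before)
{
    return (t->day_mask & (1u << d)) &&
           (uint32_t)d * SEC_PER_DAY + t->departure >= not_before;
}

/* Variant A: leave both loops with a return on the first match. */
bool find_with_return(const trip_t *trips, size_t n_trips,
                      uint32_t not_before, hit_t *out)
{
    for (size_t d = 0; d < N_SERVICEDAYS; ++d) {
        for (size_t i = 0; i < n_trips; ++i) {
            if (trip_matches(&trips[i], d, not_before)) {
                out->day = d;
                out->trip = i;
                return true;
            }
        }
    }
    return false;
}

/* Variant B: an extra boolean condition in both for-loops. */
bool find_with_flag(const trip_t *trips, size_t n_trips,
                    uint32_t not_before, hit_t *out)
{
    bool found = false;
    for (size_t d = 0; d < N_SERVICEDAYS && !found; ++d) {
        for (size_t i = 0; i < n_trips && !found; ++i) {
            if (trip_matches(&trips[i], d, not_before)) {
                out->day = d;
                out->trip = i;
                found = true;
            }
        }
    }
    return found;
}
```

Both variants visit the same elements and produce the same result; they differ only in how the loops are left, which is the difference being benchmarked above.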

<http://stefan.konink.de/rrrr/with.i>
<http://stefan.konink.de/rrrr/without.i>

Line 4673 is the difference.

Is there any explanation for this behaviour? We expected that GCC at -O3, like Clang and like GCC at all other optimisation levels, would have benefited from an early termination of the search.


The code was compiled using:
Using built-in specs.
COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/4.9.2/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.9.2/work/gcc-4.9.2/configure --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.9.2 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/include/g++-v4 --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/python --enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt --disable-werror --with-system-zlib --enable-nls --without-included-gettext --enable-checking=release --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.9.2 p1.0, pie-0.6.2' --enable-libstdcxx-time --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --with-multilib-list=m64 --disable-altivec --disable-fixed-point --enable-targets=all --disable-libgcj --enable-libgomp --disable-libmudflap --disable-libssp --enable-lto --without-cloog --enable-libsanitizer
Thread model: posix
gcc version 4.9.2 (Gentoo 4.9.2 p1.0, pie-0.6.2)

Linux medion.thuis.konink.de 3.17.1-gentoo-r1 #1 SMP PREEMPT Fri Oct 24 00:48:56 CEST 2014 x86_64 Intel(R) Core(TM) i5-2380P CPU @ 3.10GHz GenuineIntel GNU/Linux


Stefan


[1] https://github.com/bliksemlabs/rrrr/commit/025f99bb7337e956a15f8a35703d284b526a91f3


