Hi,
We are building an open source public transport journey planner in C [1].
This adventure has led us to some interesting compiler problems and
differences between gcc and clang. At the time of writing, gcc 4.9.2 with
LTO gives us the best performance.
While optimising our code we ran into the following issue: while iterating
over 3 servicedays and all trips, we observed that a return statement
(or goto) consistently slows down our gcc -O3 build by 0.4s on 1000 requests,
while the same return improves the clang build by 1.4s, and in some cases
by as much as 7s.
Some observations and benchmarks:
gcc -O3 w/out return; 14.8s
gcc -O3 with return; 15.2s
clang -O3 with return; 16.7s
clang -O3 w/out return; 18.1s
gcc -O2 with return; 18.6s
gcc -O2 w/out return; 19.5s
gcc -O1 with return; 21.5s
gcc -O1 w/out return; 22.5s
The return found in the commit below stops iterating after the first
match; given a sorted set of trips, that first match is the best trip.
Alternative methods to stop iterating, such as adding an extra boolean
condition to both for-loops, increase the benchmark time by 2s.
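To make the two exit strategies concrete, here is a minimal sketch of what is being compared. It is not the code from the commit: timetable_t, first_departure_return, first_departure_flag and NO_DEPARTURE are hypothetical names, and the data layout (one sorted array of departures plus an offset per serviceday) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_SERVICEDAYS 3
#define NO_DEPARTURE UINT32_MAX   /* hypothetical "nothing found" marker */

/* Hypothetical stand-in for the planner's data: trip departures sorted
 * ascending, plus one time offset per serviceday. */
typedef struct {
    const uint32_t *departures;            /* sorted ascending */
    size_t n_trips;
    uint32_t day_offset[N_SERVICEDAYS];    /* assumed ascending */
} timetable_t;

/* Variant A: leave both loops with a return on the first match. */
uint32_t first_departure_return(const timetable_t *tt, uint32_t not_before)
{
    assert(tt != NULL);
    for (size_t d = 0; d < N_SERVICEDAYS; ++d) {
        for (size_t t = 0; t < tt->n_trips; ++t) {
            uint32_t dep = tt->day_offset[d] + tt->departures[t];
            if (dep >= not_before)
                return dep;                /* first match is taken as best */
        }
    }
    return NO_DEPARTURE;
}

/* Variant B: same search, but terminated through an extra boolean
 * condition in both for-loops instead of a return. */
uint32_t first_departure_flag(const timetable_t *tt, uint32_t not_before)
{
    assert(tt != NULL);
    uint32_t best = NO_DEPARTURE;
    int found = 0;
    for (size_t d = 0; d < N_SERVICEDAYS && !found; ++d) {
        for (size_t t = 0; t < tt->n_trips && !found; ++t) {
            uint32_t dep = tt->day_offset[d] + tt->departures[t];
            if (dep >= not_before) {
                best = dep;
                found = 1;                 /* ends both loops at the next test */
            }
        }
    }
    return best;
}
```

Both variants return the same result; they differ only in the control flow the compiler has to optimise.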
<http://stefan.konink.de/rrrr/with.i>
<http://stefan.konink.de/rrrr/without.i>
Line 4673 is the difference.
Is there any explanation for this behavior? We expected that GCC at -O3,
like Clang and like GCC at all other optimisation levels, would have
benefited from an early termination of the search.
The compiling was done using:
Using built-in specs.
COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/4.9.2/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-4.9.2/work/gcc-4.9.2/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.9.2
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/include/g++-v4
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.9.2/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls
--without-included-gettext --enable-checking=release
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.9.2
p1.0, pie-0.6.2' --enable-libstdcxx-time --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--disable-multilib --with-multilib-list=m64 --disable-altivec
--disable-fixed-point --enable-targets=all --disable-libgcj
--enable-libgomp --disable-libmudflap --disable-libssp --enable-lto
--without-cloog --enable-libsanitizer
Thread model: posix
gcc version 4.9.2 (Gentoo 4.9.2 p1.0, pie-0.6.2)
Linux medion.thuis.konink.de 3.17.1-gentoo-r1 #1 SMP PREEMPT Fri Oct 24
00:48:56 CEST 2014 x86_64 Intel(R) Core(TM) i5-2380P CPU @ 3.10GHz
GenuineIntel GNU/Linux
Stefan
[1]
https://github.com/bliksemlabs/rrrr/commit/025f99bb7337e956a15f8a35703d284b526a91f3