Re: how to build a faster gcc working on a specific project

"U.Mutlu" <for-gmane@xxxxxxxxxxx> · Thu, 28 Jun 2018 14:58:09 +0200

U.Mutlu wrote on 06/28/2018 02:33 PM:
问 题 wrote on 06/27/2018 05:33 AM:
Hi,

I am working on a project with millions of lines of code, which takes a really
long time to compile. I know gcc has a technique called profile-guided
optimization (PGO), which makes the compiled program run faster by using some
training data. So I am wondering if I can use my project to train gcc while
compiling gcc itself. I know gcc provides a make target, 'make
profiledbootstrap', to train the compiler, but how can I use my own project
to train gcc?

I do know some other techniques like distcc and ccache to speed up
compiling, but here I want to focus on gcc.

Tony


Let me summarize: you have 3 problem areas you want to optimize:
   1) build time of the compiler,

Actually, IMO 1) should rather read
     1) run time of the compiler, i.e. fast compilation of the application
        project(s)

   2) build time of your application project,
   3) run time of your application project.

Not sure if you can use your own project as training data while building the
compiler, but I would suggest the following:

1) Build the new compiler

   - with debugging info disabled (-g0 -DNDEBUG) in all *FLAGS, i.e.
       CFLAGS
       CPPFLAGS
       CXXFLAGS
       CFLAGS_FOR_BUILD
       CPPFLAGS_FOR_BUILD
       CXXFLAGS_FOR_BUILD
       CFLAGS_FOR_TARGET
       CPPFLAGS_FOR_TARGET
       CXXFLAGS_FOR_TARGET

   - with -pipe and additional options for optimizations, like
       -Ofast
       -DCLS=$(getconf LEVEL1_DCACHE_LINESIZE)
       -fpic
       -floop-nest-optimize
       --param simultaneous-prefetches=16
       -fprefetch-loop-arrays
       -msse4.2
       -mrecip=all
       -funroll-loops
       -fdelete-null-pointer-checks
       --param prefetch-latency=32
       -ffast-math
       -ftree-vectorize
       -funsafe-math-optimizations
   (where applicable for your CPU & OS; on Linux, see cat /proc/cpuinfo for the
   supported CPU features)

   - with --disable-bootstrap, as the build then goes much faster (12 minutes
     vs. 111 minutes here)

   - use parallel make (make -j); in a script, use e.g. twice the number of CPU
     cores as the number of parallel make jobs, i.e.:
     nThr=$(getconf _NPROCESSORS_ONLN) ; nThr=$(( $nThr * 2 )) ; make -j $nThr
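
As a rough sketch of how the CPU-feature check and the job count from step 1
could be scripted (Linux only): has_cpu_flag is a hypothetical helper made up
for this example, and OPT_FLAGS holds just a small subset of the options
listed above. Whether each option actually helps on your machine is something
you would need to measure.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcc): succeeds if the CPU flag $1 is listed
# in the cpuinfo file $2 (default: /proc/cpuinfo).
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw "$1"
}

# A subset of the options from the list above; extend as needed.
OPT_FLAGS="-pipe -Ofast -g0 -DNDEBUG -funroll-loops"

# Only add -msse4.2 if the CPU advertises it (shown as sse4_2 in cpuinfo).
if has_cpu_flag sse4_2; then
    OPT_FLAGS="$OPT_FLAGS -msse4.2"
fi

# Twice the number of online cores, as suggested above; falls back to 1 core
# if getconf does not know _NPROCESSORS_ONLN.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
njobs=$(( cores * 2 ))

echo "flags: $OPT_FLAGS"
echo "make -j $njobs"
```

The resulting string could then be passed as CFLAGS, CXXFLAGS etc. on the
configure command line.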

2) Repeat step 1, but now build the new compiler with itself :-) instead of
with the (old) system compiler that was used in step 1, by setting the
following variables to point to the new compiler:
../gcc_trunk/configure -v \
     ...
     CC="$my_CC" \
     GCC="$my_GCC" \
     CXX="$my_CXX" \
     \
     CC_FOR_BUILD="$my_CC_FOR_BUILD" \
     GCC_FOR_BUILD="$my_GCC_FOR_BUILD" \
     CXX_FOR_BUILD="$my_CXX_FOR_BUILD" \
     \
     CC_FOR_TARGET="$my_CC_FOR_TARGET" \
     GCC_FOR_TARGET="$my_GCC_FOR_TARGET" \
     CXX_FOR_TARGET="$my_CXX_FOR_TARGET" \
     \
     ...
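
The my_* variables are not defined in the snippet above. A minimal sketch of
how some of them might be set, assuming the step-1 compiler was installed under
a prefix of your choice (the path $HOME/gcc-step1 and the NEW_GCC_PREFIX
variable are made up for this example); the remaining variables would follow
the same pattern:

```shell
#!/bin/sh
# Hypothetical install prefix of the step-1 compiler; adjust to your --prefix.
new_prefix="${NEW_GCC_PREFIX:-$HOME/gcc-step1}"

my_CC="$new_prefix/bin/gcc"
my_CXX="$new_prefix/bin/g++"

# For a native (non-cross) build, the build-machine compiler is the same one.
my_CC_FOR_BUILD="$my_CC"
my_CXX_FOR_BUILD="$my_CXX"

# Sanity check before re-running configure.
if [ -x "$my_CC" ]; then
    "$my_CC" --version | head -n 1
else
    echo "step-1 compiler not found at $my_CC" >&2
fi
```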

That way, after generating a fast compiler, you can then optimize
your own project. You can probably use most of the above options,
and, as you mentioned, also PGO (I have no experience with that yet,
but it sounds very promising), as well as the precompiled headers
feature (this requires a careful reorganisation of the #include statements
in the source files; it is best to put them all into a separate pch.hpp file
and include that as the very first statement in each source file ...).
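
For illustration, here is a small sketch of a precompiled header and the PGO
cycle for an application, run on a toy program. The file names (app.cpp,
pch.hpp) and the choice of -O2 are assumptions made for this example only; a
real project would do the training run with representative input and apply
the flags through its build system.

```shell
#!/bin/sh
# Toy demonstration of a precompiled header and the PGO cycle.
# File names and flags are illustrative only.
set -e
CXX="${CXX:-g++}"
work=$(mktemp -d)
cd "$work"

cat > pch.hpp <<'EOF'
#include <vector>
#include <numeric>
EOF

cat > app.cpp <<'EOF'
#include "pch.hpp"   // first include, so a matching pch.hpp.gch can be used
int main() {
    std::vector<int> v(1000, 1);
    return std::accumulate(v.begin(), v.end(), 0) == 1000 ? 0 : 1;
}
EOF

if command -v "$CXX" >/dev/null 2>&1; then
    # Precompiled header: compile pch.hpp once with the same flags as the
    # sources; -Winvalid-pch warns if the .gch cannot be used.
    "$CXX" -O2 -x c++-header pch.hpp -o pch.hpp.gch
    "$CXX" -O2 -Winvalid-pch app.cpp -o app_pch

    # PGO: 1) instrumented build, 2) training run, 3) optimized rebuild.
    "$CXX" -O2 -fprofile-generate app.cpp -o app
    ./app
    "$CXX" -O2 -fprofile-use app.cpp -o app
    ./app && echo "PGO build ok"
else
    echo "no C++ compiler found; skipping"
fi
```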