Re: [4.4] Strange performance regression?

Ian Lance Taylor wrote:
francesco biscani <bluescarni@xxxxxxxxx> writes:

I'm experiencing a strange behaviour with GCC 4.4.1. Basically I have
some C++ mathematical code which gets a ~x2 performance drop if I
*remove* the following debug line from the code:


This message is not appropriate for the mailing list gcc@xxxxxxxxxxxx
It is appropriate for gcc-help@xxxxxxxxxxxx  Please take any followups
to gcc-help.  Thanks.


In my experience, a performance drop in a tight loop when you remove a
line of code means that your loop is extremely sensitive to cache line
boundaries.  It can be difficult to find the optimal code other than
by testing various command line options.  Options to particularly test
are -falign-loops, -falign-labels, and -falign-jumps.

That seems like useful advice. The -falign-* options could help the hot loops meet the Loop Stream Detector (LSD) criteria. If you set -funroll-loops, the unrolled loop may exceed the size that fits the LSD on older CPUs, but unrolling often makes the LSD unnecessary anyway.
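
To compare builds made with different alignment options, a small timing harness around the hot loop is usually enough. The following is only a sketch: the original code is not shown in this thread, so dot() is a hypothetical stand-in for the hot loop, best_time() is an illustrative helper, and the alignment value 16 in the build comments is just a starting point to vary. It assumes a compiler with <chrono> support, which is newer than GCC 4.4.

```cpp
// Example builds to compare (flag values are assumptions to experiment with):
//   g++ -O2 -mtune=core2 bench.cpp
//   g++ -O2 -mtune=core2 -falign-loops=16 -falign-jumps=16 -falign-labels=16 bench.cpp
//   g++ -O2 -mtune=core2 -funroll-loops bench.cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the tight loop under test.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += a[i] * b[i];
    return sum;
}

// Runs the loop `repeats` times and returns the shortest wall-clock time
// in seconds. Taking the minimum reduces noise from other processes.
// Each result is added to `sink` so the compiler cannot discard the work.
double best_time(const std::vector<double>& a, const std::vector<double>& b,
                 int repeats, double& sink) {
    assert(repeats > 0);
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        sink += dot(a, b);
        const auto t1 = std::chrono::steady_clock::now();
        const double seconds = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || seconds < best)
            best = seconds;
    }
    return best;
}
```

Call best_time() on vectors large enough to run for a measurable time, print the result, and compare the numbers across the builds. If the timings move a lot with the alignment flags alone, that supports the cache line or LSD explanation.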

Also, be sure that you are using a -mtune option appropriate for the
processor on which you are running.  E.g., you mention Core2, so you
should be using -mtune=core2.
For the 64-bit compiler, the default tuning may work better than core2, but for 32-bit you should be using at least -march=pentium-m. If you are using the vectorizer, -mtune=barcelona could make a difference either way. How are you controlling which threads run on which cache, in case there are cache-sharing considerations?
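
On the thread placement question, one way to take scheduling out of the picture while benchmarking is to pin the thread to a fixed CPU. This is a minimal sketch assuming Linux with glibc (g++ defines _GNU_SOURCE by default there, which sched.h needs for the CPU_* macros). The helper names first_allowed_cpu() and pin_to_cpu() are made up for illustration, and which CPU numbers share a cache depends on the machine.

```cpp
#include <cassert>
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

// Returns the lowest-numbered CPU the calling thread may currently run on,
// or -1 if the affinity mask cannot be read or is empty.
int first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    }
    return -1;
}

// Restricts the calling thread to a single CPU. Returns true on success.
bool pin_to_cpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(one), &one) == 0;
}
```

Calling pin_to_cpu(first_allowed_cpu()) before the timed section keeps the thread on one core for the whole run. For multithreaded code you would pin each thread separately and choose the CPU numbers according to which cores share a cache on your machine.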


