performance question with std::complex<float> in new g++ versions

Hello all,

I'm stuck on a silly issue and I'm hoping there is a simple
solution to it. I have a piece of code that does nothing but
perform a very large number of multiplications between
std::complex<float> values and float values in a loop.

Using gcc-4.1.2 and gcc-4.2.4, my standard test case runs for about
7:25 and 6:50 minutes respectively on 3.0 GHz Penryn CPUs
(single-threaded). However, with gcc-4.3.4, gcc-4.4.2, or even the svn
version, the run time is more than 40 minutes, which is a serious drop
in performance. For this test I reduced the compiler options to -O3
only.

I looked a bit at the assembly code produced, and two things are
apparent. First, gcc-4.3 and newer produce assembly about twice as
long as the older gcc versions. Second, gcc-4.1 and 4.2 write out all
the multiplications inline as SSE code, while 4.3 and newer call a
routine named __mulsc3.

Has anybody encountered such a performance drop, and does anyone know
whether there is a compiler flag or something similar to get my
performance back?
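
For illustration only, the pattern described above can be sketched
roughly as follows. This is not the actual code: the function name
weighted_sum, the use of std::vector, and the accumulation step are
all assumptions made to give a small, self-contained test case.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the hot loop: multiply each complex sample
// by a real-valued weight and accumulate the result.
// Assumes samples and weights have the same length.
std::complex<float> weighted_sum(const std::vector<std::complex<float> >& samples,
                                 const std::vector<float>& weights)
{
    std::complex<float> acc(0.0f, 0.0f);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        // complex<float> * float product, the operation in question
        acc += samples[i] * weights[i];
    }
    return acc;
}
```

Compiling a file like this with "g++ -O3 -S" and searching the
resulting assembly for __mulsc3 would be one way to compare compiler
versions, though this sketch may not reproduce the call on every
version.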

Thank you,
Thomas Witzel
