Re: gcc 3.4.3: -march optimization for Intel Core2Duo

Timothy C Prince wrote:

-----Original Message-----
From: Vladimir Makarov <vmakarov@xxxxxxxxxx>
To: Ian Lance Taylor <iant@xxxxxxxxxx>
Date: Thu, 05 Oct 2006 12:08:28 -0400
Subject: Re: gcc 3.4.3:  -march optimization for Intel Core2Duo

Ian Lance Taylor wrote:

"Jan Dillmann" <jan.dillmann@xxxxxxxxxx> writes:



We are running several benchmarks (SpecCPU200...) on 32-bit Linux systems and are able to set an optimization parameter
for '-march'. We use Intel Core2Duo CPUs. Which parameter should we use (nocona, prescott...)?

gcc 3.4.3 has no specific tuning for Core2 Duo, if for no other reason
than the release was made before the processors became available.  My
guess would be that you will get the best results with -mtune=nocona.
But it is only a guess.



I believe that pentium-m will work better. Nocona (an x86_64 processor) is
based on the Northwood/Prescott core, which is a high-frequency core with
long pipelines. Core2 Duo is closer to Pentium M (a lower-frequency core
with much shorter pipelines).  However, using pentium-m will result in
bigger code than nocona because loop and function alignment will be
forced (the Northwood core is not so sensitive to alignment, therefore
alignment is not done when -mtune=nocona is used).  I don't remember
Intel's recommendation about aligning code for Core Duo (it is probably
the same as for Pentium M).
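
The code-size effect of the alignment difference described above can be checked directly. The following is only a minimal sketch, assuming a hypothetical source file named bench.c (not something from this thread). It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# bench.c is a hypothetical file name; substitute a real benchmark source.
SRC=${SRC:-bench.c}

# Print commands that compile one object file per tuning target and then
# report section sizes, so the .text sizes of the two builds can be compared.
size_comparison_cmds() {
    for cpu in nocona pentium-m; do
        echo "gcc -O2 -mtune=$cpu -c $SRC -o bench_$cpu.o"
    done
    echo "size bench_nocona.o bench_pentium-m.o"
}

size_comparison_cmds
```

Piping the output to sh would execute the commands. A larger text column for the pentium-m object would be consistent with the forced alignment, though the difference depends on the code being compiled.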


________________________________

FWIW, pentium-m is optimized by using 387 code for nearly everything except (int) casts. This is because of the Banias SSE decoder bottleneck. If you use -march=pentium-m, you would add -mfpmath=sse to attempt to get code closer to optimal for any CPU other than Banias/Dothan.
The OP's question was about Core 2 Duo, a more advanced (64-bit capable) CPU than Core Duo.

Sorry, that was a typo. When I wrote this I really meant Core 2 Duo (not Core Duo) in a 32-bit environment. Pentium-m tuning is not ideal (not only for SSE but also, for example, for the maximal number of insns issued per cycle). To be more accurate, I believe it will work better for SPEC2000. Some tuning for Core 2 Duo may be found on the yara-branch (it is called woodcrest there).
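
For anyone who wants to try the floating-point suggestion above, here is a rough sketch of the two builds to compare. The file name bench.c and the output names are hypothetical, and -O2 is just an assumed optimization level. As written the script only prints the command lines.

```shell
#!/bin/sh
# bench.c is a hypothetical file name; substitute a real benchmark source.
SRC=${SRC:-bench.c}

# Print the two candidate build commands for a 32-bit target.
fpmath_build_cmds() {
    # -march=pentium-m alone: floating point mostly in 387 code,
    # as described in the thread.
    echo "gcc -O2 -march=pentium-m -o bench_387 $SRC"
    # Same target, but asking for scalar floating point in SSE registers.
    echo "gcc -O2 -march=pentium-m -mfpmath=sse -o bench_sse $SRC"
}

fpmath_build_cmds
```

Note that 387 and SSE arithmetic can round intermediate results differently, so the two binaries may not print bit-identical floating-point output. Timing both on the actual Core 2 Duo machine is the only way to know which is faster for a given benchmark.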


