You aren't going to even scratch the surface of a 6x slowdown by blindly
tweaking compiler switches. You should try to figure out where the time
is getting spent.
Oprofile might be an effective tool for finding out where the time is spent.
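
Before reaching for a profiler, a coarse first check is to compare wall-clock time against CPU time for one run on each machine. The Python sketch below is only an illustration, not a replacement for Oprofile: the helper names `time_command` and `cpu_fraction` are made up for this example, and it assumes a Unix-like system where the `resource` module is available. A ratio well below 1 suggests the process is mostly waiting (disk, locks, swapping). A ratio near the number of cores suggests it really is CPU bound.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process; return (wall, user, system) seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return wall, user, system


def cpu_fraction(wall, user, system):
    """CPU seconds per wall-clock second (can exceed 1.0 with threads)."""
    if wall <= 0:
        return 0.0
    return (user + system) / wall


# Hypothetical usage: python timeit_sketch.py ./your_app its_args...
if __name__ == "__main__" and len(sys.argv) > 1:
    wall, user, system = time_command(sys.argv[1:])
    print("wall=%.2fs user=%.2fs sys=%.2fs cpu/wall=%.2f"
          % (wall, user, system, cpu_fraction(wall, user, system)))
```

A high `sys` figure relative to `user` would point toward kernel or I/O overhead rather than the application's own computation.
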
I think I can deduce from details in your post that you are using the
32-bit x86 architecture (rather than x86-64), but I'm not certain. To be
clear, are you running a 32-bit kernel or a 64-bit kernel?
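
One quick way to answer the 32 bit versus 64 bit question is sketched below in Python. The function name `describe_bitness` is invented for this example. `platform.machine()` returns what the kernel reports (typically `x86_64` for a 64 bit kernel and something like `i686` for a 32 bit one), while the pointer size only tells you how the Python interpreter itself was built, so the two can differ on a 64 bit kernel with a 32 bit userland.

```python
import platform
import struct


def describe_bitness():
    """Return (kernel-reported machine string, interpreter pointer width in bits)."""
    machine = platform.machine()
    pointer_bits = struct.calcsize("P") * 8
    return machine, pointer_bits


if __name__ == "__main__":
    machine, pointer_bits = describe_bitness()
    print("kernel reports: %s, this interpreter: %d bit" % (machine, pointer_bits))
```
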
I also don't feel like looking up specs on the Dell T5400 (though I have
some of them here) or the T5500. How many physical CPUs per
motherboard? (I think you said 2.) How many cores per physical CPU? How
much RAM? What kind/speed of RAM?
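
The CPU questions can usually be answered from `/proc/cpuinfo` on each box. The sketch below is a rough Python illustration that assumes the x86 Linux field names `processor`, `physical id` and `core id`; other architectures may not report them, in which case the socket and core counts come back as 0. The function name `count_cpus` is made up for this example. It does not cover RAM type or speed, which `/proc/cpuinfo` does not report.

```python
import os


def count_cpus(cpuinfo_text):
    """Return (sockets, physical cores, logical CPUs) from /proc/cpuinfo text."""
    sockets = set()
    cores = set()
    logical = 0
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        phys = fields.get("physical id")
        core = fields.get("core id")
        if phys is not None:
            sockets.add(phys)
            if core is not None:
                # Hyperthread siblings share the same (socket, core) pair.
                cores.add((phys, core))
    return len(sockets), len(cores), logical


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        print("sockets=%d cores=%d logical=%d" % count_cpus(f.read()))
```

If the logical count is larger than the core count, hyperthreading is enabled, which is worth knowing when comparing the two machines.
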
How is your application multithreaded? Is it making decent use of
however many cores you have?
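
To see whether the cores are actually busy while the application runs, you can sample the per-CPU counters in `/proc/stat` twice and compare. The following Python sketch is one way to do that, assuming the usual Linux column order (user, nice, system, idle, iowait, irq, softirq, steal). The names `parse_stat` and `utilization` are invented for this example. It reports, per core, the busy fraction and the fraction spent waiting on I/O, which matters for a disk-heavy workload.

```python
import os
import time


def parse_stat(text):
    """Map 'cpuN' to its list of cumulative tick counters from /proc/stat text."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the aggregate 'cpu' line and non-CPU lines.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            cpus[parts[0]] = [int(x) for x in parts[1:]]
    return cpus


def utilization(before, after):
    """Return {cpu: (busy_fraction, iowait_fraction)} between two snapshots."""
    result = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        delta = [n - o for n, o in zip(new, old)]
        # First eight columns only; guest time is already counted in user.
        total = sum(delta[:8])
        if total <= 0:
            continue
        idle = delta[3]
        iowait = delta[4] if len(delta) > 4 else 0
        result[name] = ((total - idle - iowait) / total, iowait / total)
    return result


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_stat(f.read())
    time.sleep(1.0)  # run this while the application is under load
    with open("/proc/stat") as f:
        second = parse_stat(f.read())
    for name, (busy, iowait) in sorted(utilization(first, second).items()):
        print("%s busy=%.0f%% iowait=%.0f%%" % (name, busy * 100, iowait * 100))
```

One core pegged with the rest idle would suggest the threading is not buying much. High iowait everywhere would point at the disk side rather than the CPU.
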
Maybe there will be some appropriate compiler switches and/or a compiler
upgrade to get a little more performance after you have understood and
fixed the main problem. But at this point, you're looking in the wrong
place. I don't know whether you have a filesystem or other system
configuration level issue. There is some chance your algorithm was very
well tuned to exactly the cache structure of the previous cpu. But I
doubt it. I don't think the CPU or cache change explains the 6x
performance loss. Without a lot more info, I wouldn't even know where
to look to find the real issue. But I know compiler optimization
switches wouldn't make the list of things I'd even consider at this point.
Brian McGrew wrote:
Our application is very processor and disk I/O intensive and it runs about
6x slower on the newer hardware versus the old.