Re: Xeon twice the performance of opteron

"mark" <dvlhntr@xxxxxxxxx> · Thu, 17 Mar 2011 19:24:09 -0600

> -----Original Message-----
> From: pgsql-performance-owner@xxxxxxxxxxxxxx [mailto:pgsql-performance-
> owner@xxxxxxxxxxxxxx] On Behalf Of Jeff
> Sent: Thursday, March 17, 2011 9:14 AM
> To: pgsql-performance@xxxxxxxxxxxxxx
> Cc: Brian Ristuccia
> Subject:  Xeon twice the performance of opteron
> 
> hey folks,
> 
> Running into some odd performance issues between a few of our db
> boxes.  While trying to speed up a query I ran it on another box and
> it was twice as fast.  The plans are identical and various portions of
> the query run in the same amount of time - it all boils down to most
> of the time being spent in a join filter.  The plan is as good as it
> is going to get but the thing that is concerning me, which hopefully
> some folks here may have some insight on, is the very large difference
> in runtime.
> 
> three boxes:
> 	A: Intel(R) Xeon(R) CPU           E5345  @ 2.33GHz  (Runs query
> fastest)
> 		4MB cache
> 	B: Quad-Core AMD Opteron(tm) Processor 2352 (2.1GHZ) (Main
> production
> box, currently, middle speed)
> 		512k cache
> 	C: Quad-Core AMD Opteron(tm) Processor 2378 (2.4GHZ)
> 		512k cache
> 
> A & B are running PG 8.4.2 (yes, I know it desperately need to be
> upgraded). C was also on 8.4.2 and since it was not in production I
> upgraded it to 8.4.7 and got the same performance as 8.4.2.  Dataset
> on A & B is the same C is mostly the same, but is missing a couple
> weeks of data (but since this query runs over 3 years of data, it is
> negligable - plus C runs the slowest!)
> 
> All three running FC10 with kernel Linux db06
> 2.6.27.19-170.2.35.fc10.x86_64 #1 SMP Mon Feb 23 13:00:23 EST 2009
> x86_64 x86_64 x86_64 GNU/Linux
> 
> Load is very low on each box. The query is running from shared_buffers
> - no real IO is occuring.
> 
> The average timing for the query in question is 90ms on A,  180ms on B
> and 190ms on C.
> 
> Now here's where some odd stuff starts piling up: explain analyze
> overhead on said queries:
> 20ms on A, 50ms on B and 85ms on C(!!)
> 
> We had one thought about potential NUMA issues, but doing a series
> (100) of connect, query, disconnect and looking at the timings reveals
> them all to be solid... but even still we wouldn't expect it to be
> that awful.  The smaller cache of the opterons is also a valid
> argument.
> 
> I know we're running an old kernel, I'm tempted to upgrade to see what
> will happen, but at the same time I'm afraid it'll upgrade to a kernel
> with a broken [insert major subsystem here] which has happened before.
> 
> Anybody have some insight into this or run into this before?
> 
> btw, little more background on the query:
> 
>     ->  Nested Loop  (cost=5.87..2763.69 rows=9943 width=0) (actual
> time=0.571..2
> 74.750 rows=766 loops=1)
>           Join Filter: (ce.eventdate >= (md.date - '6 days'::interval))
>           ->  Nested Loop  (cost=5.87..1717.98 rows=27 width=8)
> (actual time=0.53
> 3..8.301 rows=159 loops=1)
> 		[stuff removed here]
>          ->  Index Scan using  xxxxxxx_date_idx on xxxxxx md
> (cost=0.00..19.50 rows=1099 width=8) (actual time=0.023..0.729
> rows=951 loops=15
> 9)
>                 Index Cond: (ce.eventdate <= md.date)
> 
> 
> On all three boxes that inner nestloop completes in about the same
> amount of time - it is that join filter that is causing the pain and
> agony. (If you are noticing the timing differences, that is because
> the numbers above are the actual numbers, not explain analyze).  The
> query is pulling up a rolling window of events that occured on a
> specific date. This query pulls up al the data for a period of time.
> ce.eventdate is indexed, and is used in the outer nestloop.  Thinking
> more about what is going on cache thrashing is certainly a possibility.
> 
> the amazing explain analyze overhead is also very curious - we all
> know it adds overhead, but 85ms?  Yow.
> 
> --
> Jeff Trout <jeff@xxxxxxxxxxxxx>
> http://www.stuarthamm.net/
> http://www.dellsmartexitin.com/

I am sure you might have already checked for this, but just incase...
Did you verify that no power savings stuff is turned on in the BIOS or at
the kernel ?

I have to set ours to something HP calls static high performance or
something like that if I want boxes that are normally pretty idle to execute
in a predictable fashion for sub second queries. 

I assume you checked with a steam benchmark results on the AMD machines to
make sure they are getting in the ballpark of where they are supposed to ? 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance