> -----Original Message----- > From: pgsql-performance-owner@xxxxxxxxxxxxxx [mailto:pgsql-performance- > owner@xxxxxxxxxxxxxx] On Behalf Of Jeff > Sent: Thursday, March 17, 2011 9:14 AM > To: pgsql-performance@xxxxxxxxxxxxxx > Cc: Brian Ristuccia > Subject: Xeon twice the performance of opteron > > hey folks, > > Running into some odd performance issues between a few of our db > boxes. While trying to speed up a query I ran it on another box and > it was twice as fast. The plans are identical and various portions of > the query run in the same amount of time - it all boils down to most > of the time being spent in a join filter. The plan is as good as it > is going to get but the thing that is concerning me, which hopefully > some folks here may have some insight on, is the very large difference > in runtime. > > three boxes: > A: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz (Runs query > fastest) > 4MB cache > B: Quad-Core AMD Opteron(tm) Processor 2352 (2.1GHZ) (Main > production > box, currently, middle speed) > 512k cache > C: Quad-Core AMD Opteron(tm) Processor 2378 (2.4GHZ) > 512k cache > > A & B are running PG 8.4.2 (yes, I know it desperately need to be > upgraded). C was also on 8.4.2 and since it was not in production I > upgraded it to 8.4.7 and got the same performance as 8.4.2. Dataset > on A & B is the same C is mostly the same, but is missing a couple > weeks of data (but since this query runs over 3 years of data, it is > negligable - plus C runs the slowest!) > > All three running FC10 with kernel Linux db06 > 2.6.27.19-170.2.35.fc10.x86_64 #1 SMP Mon Feb 23 13:00:23 EST 2009 > x86_64 x86_64 x86_64 GNU/Linux > > Load is very low on each box. The query is running from shared_buffers > - no real IO is occuring. > > The average timing for the query in question is 90ms on A, 180ms on B > and 190ms on C. > > Now here's where some odd stuff starts piling up: explain analyze > overhead on said queries: > 20ms on A, 50ms on B and 85ms on C(!!) > > We had one thought about potential NUMA issues, but doing a series > (100) of connect, query, disconnect and looking at the timings reveals > them all to be solid... but even still we wouldn't expect it to be > that awful. The smaller cache of the opterons is also a valid > argument. > > I know we're running an old kernel, I'm tempted to upgrade to see what > will happen, but at the same time I'm afraid it'll upgrade to a kernel > with a broken [insert major subsystem here] which has happened before. > > Anybody have some insight into this or run into this before? > > btw, little more background on the query: > > -> Nested Loop (cost=5.87..2763.69 rows=9943 width=0) (actual > time=0.571..2 > 74.750 rows=766 loops=1) > Join Filter: (ce.eventdate >= (md.date - '6 days'::interval)) > -> Nested Loop (cost=5.87..1717.98 rows=27 width=8) > (actual time=0.53 > 3..8.301 rows=159 loops=1) > [stuff removed here] > -> Index Scan using xxxxxxx_date_idx on xxxxxx md > (cost=0.00..19.50 rows=1099 width=8) (actual time=0.023..0.729 > rows=951 loops=15 > 9) > Index Cond: (ce.eventdate <= md.date) > > > On all three boxes that inner nestloop completes in about the same > amount of time - it is that join filter that is causing the pain and > agony. (If you are noticing the timing differences, that is because > the numbers above are the actual numbers, not explain analyze). The > query is pulling up a rolling window of events that occured on a > specific date. This query pulls up al the data for a period of time. > ce.eventdate is indexed, and is used in the outer nestloop. Thinking > more about what is going on cache thrashing is certainly a possibility. > > the amazing explain analyze overhead is also very curious - we all > know it adds overhead, but 85ms? Yow. > > -- > Jeff Trout <jeff@xxxxxxxxxxxxx> > http://www.stuarthamm.net/ > http://www.dellsmartexitin.com/ I am sure you might have already checked for this, but just incase... Did you verify that no power savings stuff is turned on in the BIOS or at the kernel ? I have to set ours to something HP calls static high performance or something like that if I want boxes that are normally pretty idle to execute in a predictable fashion for sub second queries. I assume you checked with a steam benchmark results on the AMD machines to make sure they are getting in the ballpark of where they are supposed to ? -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance