Hi, sorry about the blank post yesterday – let’s
try again We have two machines. Both running Linux Redhat, both
running postgres 8.2.5. Both have nearly identical 125 GB databases. In fact we use
PITR Recovery to Replicate from one to the other. The machine we replicate
to runs a query with About 10 inner and left joins about 5 times slower than the original
machine I run an explain on both. Machine1 (original) planner
favors hash joins about 3 to 1 Over nested loop joins. Machine2 (replicated) uses only
nested loop joins – no hash at all. A few details – I can always provide more MACHINE1 – original: TOTAL RAW MEMORY – 30 GB TOTAL SHARED MEMORY (shmmax value) – 4 GB Database configs SHARED_BUFFERS ------------– 1525 MB MAX_PREPARED_TRANSACTIONS – 5 WORK_MEM – -------------------- 300 MB MAINTENANCE_WORK_MEM - 512 MB MAX_FSM_PAGES -------------- 3,000,000 CHECKPOINT_SEGMENTS ----- 64 WAL_BUFFERS --------------------- 768 EFFECTIVE_CACHE_SIZE ---- 2 GB Planner method configs all turned on by default,
including enable_hashjoin MACHINE2 – we run 2 postgres instances. Port 5433
runs continuous PITR recoveries Port 5432 receives the ‘latest and greatest’
database when port 5433 finishes a recovery TOTAL RAW MEMORY – 16 GB (this is a VMWARE
setup on a netapp) TOTAL SHARED MEMORY (shmmax value) – 4 GB Database configs – port 5432 instance SHARED_BUFFERS -----------– 1500 MB MAX_PREPARED_TRANSACTIONS – 1 (we don’t
run prepared transactions here) WORK_MEM – -------------------- 300 MB MAINTENANCE_WORK_MEM - 100 MB (don’t think
this comes into play in this conversation) MAX_FSM_PAGES -------------- 1,000,000 CHECKPOINT_SEGMENTS ----- 32 WAL_BUFFERS --------------------- 768 EFFECTIVE_CACHE_SIZE ---- 2 GB Planner method configs all turned on by default,
including enable_hashjoin Database configs – port 5433 instance SHARED_BUFFERS -----------– 1500 MB MAX_PREPARED_TRANSACTIONS – 1 (we don’t
run prepared transactions here) WORK_MEM – -------------------- 250 MB MAINTENANCE_WORK_MEM - 100 MB (don’t think
this comes into play in this conversation) MAX_FSM_PAGES -------------- 1,000,000 CHECKPOINT_SEGMENTS ----- 32 WAL_BUFFERS --------------------- 768 EFFECTIVE_CACHE_SIZE ---- 2 GB Planner method configs all turned on by default,
including enable_hashjoin Now some size details about the 11 tables involved in the
join All join fields are indexed unless otherwise noted and
are of type integer unless otherwise noted TABLE1 -------------398 pages TABLE2 -------- 5,014 pages INNER JOIN on TABLE1 TABLE3 ------- 34,729 pages INNER JOIN on TABLE2 TABLE4 ----1,828,000 pages INNER JOIN on TABLE2 TABLE5 ----1,838,000 pages INNER JOIN on TABLE4 TABLE6 ------ 122,500 pages INNER JOIN on TABLE4 TABLE7 ----------- 621 pages INNER JOIN on TABLE6 TABLE8 ---------- 4 pages INNER JOIN on TABLE7
(TABLE7 column not indexed) TABLE9 ----------- 2 pages INNER JOIN on TABLE8
(TABLE8 column not indexed) TABLE10 --------- 13 pages LEFT JOIN on TABLE6 (columns
on both tables text, neither column indexed) TABLE11 -1,976,430 pages LEFT JOIN on TABLE5. AND explicit
join on TABLE6 The WHERE clause filters out primary key values
from TABLE1 to 1 value and a 1 month range of Indexed dates from TABLE4. So, my guess is the disparity of performance (40 seconds vs
180 seconds) has to do with MACHINE2 not Availing itself of hash joins which by my understanding is
much faster. Any help / insight appreciated. Thank you Mark Steben│Database
Administrator│ @utoRevenue® "Join the Revenue-tion" @utoRevenue is a registered
trademark and a division of Dominion Enterprises |