[ Attempting to resend, because it didn't seem to get through last time. ] We have a query that runs very slowly on our 8.3 database. (I can't tell you exactly how slowly, because it has never successfully run to completion even when we left it running overnight.) On the 8.4 database on my laptop, it runs in about 90 seconds. Of course there are several differences between the two instances, but I wonder whether query planning improvements in 8.4 could essentially account for it. (In case you're wondering, I ran 'vacuum full analyze' on the slow one, and it made no discernible difference.) The two instances have the same schema and data. The query looks like this: select first.feature_cvterm_id from feature_cvterm first join feature_cvterm_dbxref first_fvd on first.feature_cvterm_id = first_fvd.feature_cvterm_id join dbxref first_withfrom_dbxref on first_fvd.dbxref_id = first_withfrom_dbxref.dbxref_id , cv , cvterm evidence_type join cv evidence_type_cv on evidence_type.cv_id = evidence_type_cv.cv_id , feature_cvtermprop first_evidence , feature_cvterm second left join feature_cvtermprop second_evidence on second_evidence.feature_cvterm_id = second.feature_cvterm_id join feature_cvterm_dbxref second_fvd on second.feature_cvterm_id = second_fvd.feature_cvterm_id join dbxref second_withfrom_dbxref on second_fvd.dbxref_id = second_withfrom_dbxref.dbxref_id , cvterm second_term , cvterm first_term , feature where first.cvterm_id = first_term.cvterm_id and first_evidence.feature_cvterm_id = first.feature_cvterm_id and second_term.cv_id = cv.cv_id and first_term.cv_id = cv.cv_id and cv.name in ( 'biological_process' , 'molecular_function' , 'cellular_component' ) and second.feature_id = feature.feature_id and second.feature_id = first.feature_id and first.cvterm_id = first_term.cvterm_id and second.cvterm_id = second_term.cvterm_id and second.pub_id = first.pub_id and evidence_type.name = 'evidence' and evidence_type_cv.name = 'genedb_misc' and second_evidence.type_id = evidence_type.cvterm_id and first_evidence.type_id = evidence_type.cvterm_id and second.feature_cvterm_id > first.feature_cvterm_id and first_withfrom_dbxref.accession = second_withfrom_dbxref.accession and upper(first_evidence.value) = upper(second_evidence.value) and first_term.name = second_term.name ; (There's some fairly obvious room for improvement in this query as written, but none of the changes I've tried have changed the overall performance picture.) The execution plan on the (slow) 8.3 server is: Nested Loop (cost=44050.86..77140.03 rows=1 width=4) Join Filter: (second_term.cv_id = cv.cv_id) -> Nested Loop (cost=44050.86..77138.61 rows=1 width=12) Join Filter: ((first_term.cv_id = second_term.cv_id) AND ((first_term.name)::text = (second_term.name)::text)) -> Nested Loop (cost=44050.86..77130.32 rows=1 width=56) -> Nested Loop (cost=44050.86..77122.65 rows=1 width=12) Join Filter: (upper(second_evidence.value) = upper(first_evidence.value)) -> Nested Loop (cost=44050.86..77114.32 rows=1 width=50) Join Filter: ((second.feature_cvterm_id > first.feature_cvterm_id) AND (second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text)) -> Nested Loop (cost=30794.26..42915.70 rows=1 width=69) -> Hash Join (cost=30794.26..42906.88 rows=1 width=65) Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id) -> Hash Join (cost=30784.59..42807.07 rows=24035 width=61) Hash Cond: (second_fvd.dbxref_id = second_withfrom_dbxref.dbxref_id) -> Hash Join (cost=19044.44..28262.26 rows=24035 width=50) Hash Cond: (second_evidence.feature_cvterm_id = second.feature_cvterm_id) -> Seq Scan on feature_cvtermprop second_evidence (cost=0.00..4370.07 rows=223307 width=34) -> Hash (cost=18169.19..18169.19 rows=47620 width=24) -> Hash Join (cost=1516.45..18169.19 rows=47620 width=24) Hash Cond: (second.feature_cvterm_id = second_fvd.feature_cvterm_id) -> Seq Scan on feature_cvterm second (cost=0.00..7243.27 rows=442427 width=16) -> Hash (cost=734.20..734.20 rows=47620 width=8) -> Seq Scan on feature_cvterm_dbxref second_fvd (cost=0.00..734.20 rows=47620 width=8) -> Hash (cost=5838.29..5838.29 rows=321429 width=19) -> Seq Scan on dbxref second_withfrom_dbxref (cost=0.00..5838.29 rows=321429 width=19) -> Hash (cost=9.66..9.66 rows=1 width=4) -> Nested Loop (cost=0.00..9.66 rows=1 width=4) Join Filter: (evidence_type.cv_id = evidence_type_cv.cv_id) -> Seq Scan on cv evidence_type_cv (cost=0.00..1.35 rows=1 width=4) Filter: ((name)::text = 'genedb_misc'::text) -> Index Scan using cvterm_idx2 on cvterm evidence_type (cost=0.00..8.29 rows=1 width=8) Index Cond: ((evidence_type.name)::text = 'evidence'::text) -> Index Scan using feature_pkey on feature (cost=0.00..8.81 rows=1 width=4) Index Cond: (feature.feature_id = second.feature_id) -> Hash Join (cost=13256.60..33246.22 rows=47620 width=35) Hash Cond: (first_fvd.dbxref_id = first_withfrom_dbxref.dbxref_id) -> Hash Join (cost=1516.45..18169.19 rows=47620 width=24) Hash Cond: (first.feature_cvterm_id = first_fvd.feature_cvterm_id) -> Seq Scan on feature_cvterm first (cost=0.00..7243.27 rows=442427 width=16) -> Hash (cost=734.20..734.20 rows=47620 width=8) -> Seq Scan on feature_cvterm_dbxref first_fvd (cost=0.00..734.20 rows=47620 width=8) -> Hash (cost=5838.29..5838.29 rows=321429 width=19) -> Seq Scan on dbxref first_withfrom_dbxref (cost=0.00..5838.29 rows=321429 width=19) -> Index Scan using feature_cvtermprop_c1 on feature_cvtermprop first_evidence (cost=0.00..8.31 rows=1 width=34) Index Cond: ((first_evidence.feature_cvterm_id = first.feature_cvterm_id) AND (first_evidence.type_id = second_evidence.type_id)) -> Index Scan using cvterm_pkey on cvterm first_term (cost=0.00..7.65 rows=1 width=52) Index Cond: (first_term.cvterm_id = first.cvterm_id) -> Index Scan using cvterm_pkey on cvterm second_term (cost=0.00..8.28 rows=1 width=52) Index Cond: (second_term.cvterm_id = second.cvterm_id) -> Seq Scan on cv (cost=0.00..1.39 rows=3 width=4) Filter: ((cv.name)::text = ANY ('{biological_process,molecular_function,cellular_component}'::text[])) and on the (fast) 8.4 server is: Nested Loop (cost=63949.73..77767.20 rows=1 width=4) Join Filter: (second_term.cv_id = cv.cv_id) -> Nested Loop (cost=63949.73..77765.78 rows=1 width=12) Join Filter: (upper(second_evidence.value) = upper(first_evidence.value)) -> Nested Loop (cost=63949.73..77757.46 rows=1 width=51) Join Filter: ((first_term.cv_id = second_term.cv_id) AND ((first_term.name)::text = (second_term.name)::text)) -> Nested Loop (cost=63949.73..77749.17 rows=1 width=95) -> Nested Loop (cost=63949.73..77741.55 rows=1 width=51) -> Hash Join (cost=63949.73..77732.49 rows=1 width=59) Hash Cond: ((second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text)) Join Filter: (second.feature_cvterm_id > first.feature_cvterm_id) -> Hash Join (cost=30236.57..41303.13 rows=4607 width=66) Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id) -> Hash Join (cost=30226.90..41161.01 rows=23034 width=62) Hash Cond: (second_fvd.dbxref_id = second_withfrom_dbxref.dbxref_id) -> Hash Join (cost=18735.19..27081.42 rows=23034 width=51) Hash Cond: (second_evidence.feature_cvterm_id = second.feature_cvterm_id) -> Seq Scan on feature_cvtermprop second_evidence (cost=0.00..3965.92 rows=210392 width=35) -> Hash (cost=17861.31..17861.31 rows=47590 width=24) -> Hash Join (cost=1490.78..17861.31 rows=47590 width=24) Hash Cond: (second.feature_cvterm_id = second_fvd.feature_cvterm_id) -> Seq Scan on feature_cvterm second (cost=0.00..7115.82 rows=434682 width=16) -> Hash (cost=709.90..709.90 rows=47590 width=8) -> Seq Scan on feature_cvterm_dbxref second_fvd (cost=0.00..709.90 rows=47590 width=8) -> Hash (cost=5743.87..5743.87 rows=321587 width=19) -> Seq Scan on dbxref second_withfrom_dbxref (cost=0.00..5743.87 rows=321587 width=19) -> Hash (cost=9.66..9.66 rows=1 width=4) -> Nested Loop (cost=0.00..9.66 rows=1 width=4) Join Filter: (evidence_type.cv_id = evidence_type_cv.cv_id) -> Seq Scan on cv evidence_type_cv (cost=0.00..1.35 rows=1 width=4) Filter: ((name)::text = 'genedb_misc'::text) -> Index Scan using cvterm_idx2 on cvterm evidence_type (cost=0.00..8.29 rows=1 width=8) Index Cond: ((evidence_type.name)::text = 'evidence'::text) -> Hash (cost=32531.33..32531.33 rows=47590 width=35) -> Hash Join (cost=12982.48..32531.33 rows=47590 width=35) Hash Cond: (first_fvd.dbxref_id = first_withfrom_dbxref.dbxref_id) -> Hash Join (cost=1490.78..17861.31 rows=47590 width=24) Hash Cond: (first.feature_cvterm_id = first_fvd.feature_cvterm_id) -> Seq Scan on feature_cvterm first (cost=0.00..7115.82 rows=434682 width=16) -> Hash (cost=709.90..709.90 rows=47590 width=8) -> Seq Scan on feature_cvterm_dbxref first_fvd (cost=0.00..709.90 rows=47590 width=8) -> Hash (cost=5743.87..5743.87 rows=321587 width=19) -> Seq Scan on dbxref first_withfrom_dbxref (cost=0.00..5743.87 rows=321587 width=19) -> Index Scan using feature_pkey on feature (cost=0.00..9.04 rows=1 width=4) Index Cond: (feature.feature_id = second.feature_id) -> Index Scan using cvterm_pkey on cvterm first_term (cost=0.00..7.61 rows=1 width=52) Index Cond: (first_term.cvterm_id = first.cvterm_id) -> Index Scan using cvterm_pkey on cvterm second_term (cost=0.00..8.27 rows=1 width=52) Index Cond: (second_term.cvterm_id = second.cvterm_id) -> Index Scan using feature_cvtermprop_c1 on feature_cvtermprop first_evidence (cost=0.00..8.30 rows=1 width=35) Index Cond: ((first_evidence.feature_cvterm_id = first.feature_cvterm_id) AND (first_evidence.type_id = second_evidence.type_id)) -> Seq Scan on cv (cost=0.00..1.39 rows=3 width=4) Filter: ((cv.name)::text = ANY ('{biological_process,molecular_function,cellular_component}'::text[])) Any insights would be much appreciated. Thanks, Robin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance