Re: Huge difference in query performance between 8.3 and 8.4 (possibly)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Robin Houston <robin.houston@xxxxxxxxx> writes:
> We have a query that runs very slowly on our 8.3 database. (I can't
> tell you exactly how slowly, because it has never successfully run to
> completion even when we left it running overnight.) On the 8.4
> database on my laptop, it runs in about 90 seconds. Of course there
> are several differences between the two instances, but I wonder
> whether query planning improvements in 8.4 could essentially account
> for it.

Well, it's hard to be sure with only EXPLAIN and not EXPLAIN ANALYZE
output to look at; but I think the significant difference in these plans
is that 8.4 has chosen a hash instead of nestloop join for a couple of
the intermediate join levels.  Which is evidently because of a change
in the estimated size of the next join down:

                     ->  Nested Loop  (cost=44050.86..77114.32 rows=1 width=50)
                           Join Filter: ((second.feature_cvterm_id > first.feature_cvterm_id) AND (second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text))
                           ->  Nested Loop  (cost=30794.26..42915.70 rows=1 width=69)
                                 ->  Hash Join (cost=30794.26..42906.88 rows=1 width=65)
                                       Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id)

versus

                           ->  Hash Join  (cost=63949.73..77732.49 rows=1 width=59)
                                 Hash Cond: ((second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text))
                                 Join Filter: (second.feature_cvterm_id > first.feature_cvterm_id)
                                 ->  Hash Join (cost=30236.57..41303.13 rows=4607 width=66)
                                       Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id)

If the 8.4 rowcount estimate is accurate then it's not surprising that
the nestloop plan sucks --- it'd be re-executing the other arm of the
join 4600 or so times.

This could reflect improvements in the join size estimation code, or
maybe it's just a consequence of 8.4 using larger statistics targets
by default.  It's hard to be sure with so little information to go on.

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux