Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Yes:

select forum, * from post where
    marked is not true
    and toppost = 1
    and (select login from ignore_thread where login='xxx' and number=post.number) is null
    and (replied > now() - '30 days'::interval)
    and (replied > (select lastview from forumlog where login='xxx' and forum=post.forum and number is null)) is not false
    and (replied > (select lastview from forumlog where login='xxx' and forum=post.forum and number=post.number)) is not false
    and ((post.forum = (select who from excludenew where who='xxx' and forum_name = post.forum)) or (select who from excludenew where who='xxx' and forum_name = post.forum) is null)
   order by pinned desc, replied desc offset 0 limit 100

Returns the following query plan:
                                                              QUERY PLAN                                                                                                                           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5301575.63..5301575.63 rows=1 width=433) (actual time=771.000..771.455 rows=100 loops=1)
   ->  Sort  (cost=5301575.63..5301575.63 rows=1 width=433) (actual time=770.996..771.141 rows=100 loops=1)
         Sort Key: post.pinned, post.replied
         Sort Method:  top-N heapsort  Memory: 120kB
         ->  Index Scan using post_top on post  (cost=0.00..5301575.62 rows=1 width=433) (actual time=0.664..752.994 rows=3905 loops=1)
               Index Cond: (toppost = 1)
               Filter: ((marked IS NOT TRUE) AND (replied > (now() - '30 days'::interval)) AND ((SubPlan 1) IS NULL) AND ((replied > (SubPlan 2)) IS NOT FALSE) AND ((replied > (SubPlan 3)) IS NOT FALSE) AND ((forum = (SubPlan 4)) OR ((SubPlan 5) IS NULL)))
               SubPlan 1
                 ->  Seq Scan on ignore_thread  (cost=0.00..5.45 rows=1 width=7) (actual time=0.037..0.037 rows=0 loops=3905)
                       Filter: ((login = 'xxx'::text) AND (number = $0))
               SubPlan 2
                 ->  Index Scan using forumlog_composite on forumlog  (cost=0.00..9.50 rows=1 width=8) (actual time=0.008..0.008 rows=0 loops=3905)
                       Index Cond: ((login = 'xxx'::text) AND (forum = $1) AND (number IS NULL))
               SubPlan 3
                 ->  Index Scan using forumlog_composite on forumlog  (cost=0.00..9.50 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=3905)
                       Index Cond: ((login = 'xxx'::text) AND (forum = $1) AND (number = $0))
               SubPlan 4
                 ->  Index Scan using excludenew_pkey on excludenew  (cost=0.00..8.27 rows=1 width=9) (actual time=0.004..0.004 rows=0 loops=3905)
                       Index Cond: ((who = 'xxx'::text) AND (forum_name = $1))
               SubPlan 5
                 ->  Index Scan using excludenew_pkey on excludenew  (cost=0.00..8.27 rows=1 width=9) (actual time=0.004..0.004 rows=0 loops=3905)
                       Index Cond: ((who = 'xxx'::text) AND (forum_name = $1))
 Total runtime: 771.907 ms
(23 rows)

The alternative:

select forum, * from post where
    marked is not true
    and toppost = 1
    and not exists (select login from ignore_thread where login='xxx' and number=post.number)
    and (replied > now() - '30 days'::interval)
    and (replied > (select lastview from forumlog where login='xxx' and forum=post.forum and number is null)) is not false
    and (replied > (select lastview from forumlog where login='xxx' and forum=post.forum and number=post.number)) is not false
    and ((post.forum = (select who from excludenew where who='xxx' and forum_name = post.forum)) or (select who from excludenew where who='xxx' and forum_name = post.forum) is null)
   order by pinned desc, replied desc offset 0 limit 100

goes nuts.

(Yes, I know, most of those others which are "not false" could be "Exists" too)

Explain Analyze on the alternative CLAIMS the same query planner time (within a few milliseconds) with explain analyze.  But if I replace the executing code with one that has the alternative ("not exists") syntax in it, the system load goes to crap instantly and the execution times "in the wild" go bananas.

I don't know why it does - I just know THAT it does.  When I first added that top clause in there (to allow people an individual "ignore thread" list) the system load went bananas immediately and forced me to back it out.   When I re-wrote the query as above the performance was (and remains) fine.

I'm running 8.4.2.

I agree (in advance) it shouldn't trash performance - all I know is that it does and forced me to re-write the query.


Kevin Grittner wrote:
Karl Denninger <karl@xxxxxxxxxxxxx> wrote:
Kevin Grittner wrote:
 
  
I suspect that the above might do pretty well in 8.4.
      
 
  
"Exists" can be quite slow.  So can "not exists"

See if you can re-write it using a sub-select - just replace the
"exists ...." with "(select ...) is not null"

Surprisingly this often results in a MUCH better query plan under
Postgresql.  Why the planner evaluates it "better" eludes me (it
shouldn't) but the differences are often STRIKING - I've seen
factor-of-10 differences in execution performance.
    
 
Have you seen such a difference under 8.4?  Can you provide a
self-contained example?
 
-Kevin

  
begin:vcard
fn:Karl Denninger
n:Denninger;Karl
email;internet:karl@xxxxxxxxxxxxx
x-mozilla-html:TRUE
version:2.1
end:vcard

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux