Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



So how about the removal of the "AND" clause? On a test server, this
drops the query from 201204 to 438 ms.
Is this just random, or is it a real solution that might apply to any
arbitrary combination of words?

Attached are three test runs:
Total runtime: 201204.972 ms
Total runtime: 437.766 ms
Total runtime: 341.727 ms

preproduction-20091214=# ALTER TABLE article_words ALTER COLUMN word_key SET STATISTICS 50;
ALTER TABLE
preproduction-20091214=# ANALYZE VERBOSE article_words;
INFO:  analyzing "public.article_words"
INFO:  "article_words": scanned 75000 of 1342374 pages, containing 16124750 live rows and 825250 dead rows; 75000 rows in sample, 288605935 estimated total rows
ANALYZE


preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM contexts
JOIN articles ON (articles.context_key=contexts.context_key)
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
         (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key = bp_categories.context_key)
         WHERE lower(bp_categories.category) = 'law') AND articles.indexed;
                                                                                              QUERY PLAN                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=12665.41..61953.71 rows=2 width=4) (actual time=140.799..201204.025 rows=546 loops=1)
   ->  Nested Loop IN Join  (cost=12665.41..61951.82 rows=1 width=16) (actual time=140.786..201196.526 rows=546 loops=1)
         Join Filter: (a.context_key = articles.context_key)
         ->  Nested Loop  (cost=12665.41..12678.98 rows=2 width=12) (actual time=55.453..116.790 rows=1306 loops=1)
               ->  Nested Loop  (cost=12665.41..12675.18 rows=2 width=8) (actual time=55.439..98.132 rows=1306 loops=1)
                     ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual time=55.415..76.025 rows=1473 loops=1)
                           ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual time=55.413..73.885 rows=1473 loops=1)
                                 ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual time=55.406..62.222 rows=17892 loops=1)
                                       Sort Key: "*SELECT* 1".context_key
                                       Sort Method:  quicksort  Memory: 1607kB
                                       ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual time=0.060..42.065 rows=17892 loops=1)
                                             ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.059..6.962 rows=3583 loops=1)
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.058..5.148 rows=3583 loops=1)
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.018..0.019 rows=1 loops=1)
                                                               Index Cond: ((word)::text = 'insider'::text)
                                                         ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.036..3.275 rows=3583 loops=1)
                                                               Index Cond: (public.article_words.word_key = public.words.word_key)
                                             ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.037..27.136 rows=14309 loops=1)
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.035..19.912 rows=14309 loops=1)
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.015..0.017 rows=1 loops=1)
                                                               Index Cond: ((word)::text = 'trading'::text)
                                                         ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.018..12.464 rows=14309 loops=1)
                                                               Index Cond: (public.article_words.word_key = public.words.word_key)
                     ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual time=0.011..0.012 rows=1 loops=1473)
                           Index Cond: (articles.context_key = "IN_subquery".context_key)
                           Filter: articles.indexed
               ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual time=0.010..0.011 rows=1 loops=1306)
                     Index Cond: (contexts.context_key = articles.context_key)
         ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual time=0.012..142.771 rows=39193 loops=1306)
               ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual time=0.006..19.719 rows=14552 loops=1306)
                     Filter: (lower(category) = 'law'::text)
               ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a  (cost=0.00..2.57 rows=76 width=8) (actual time=0.004..0.006 rows=3 loops=19004304)
                     Index Cond: (a.ancestor_key = bp_categories.context_key)
   ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual time=0.010..0.011 rows=1 loops=546)
         Index Cond: (matview_82034.context_key = articles.context_key)
 Total runtime: 201204.972 ms
(36 rows)





preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM contexts
JOIN articles ON (articles.context_key=contexts.context_key)
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
         (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key = bp_categories.context_key)
         WHERE lower(bp_categories.category) = 'law');

                                                                                              QUERY PLAN                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=90968.66..95382.31 rows=5 width=4) (actual time=284.789..437.241 rows=648 loops=1)
   ->  Nested Loop  (cost=90968.66..95378.53 rows=2 width=16) (actual time=284.778..338.030 rows=649 loops=1)
         ->  Hash IN Join  (cost=90968.66..95374.73 rows=2 width=12) (actual time=284.764..333.145 rows=649 loops=1)
               Hash Cond: (contexts.context_key = a.context_key)
               ->  Nested Loop  (cost=12665.41..12675.20 rows=5 width=8) (actual time=55.557..81.206 rows=1473 loops=1)
                     ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual time=55.537..69.892 rows=1473 loops=1)
                           ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual time=55.535..69.075 rows=1473 loops=1)
                                 ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual time=55.528..60.439 rows=17892 loops=1)
                                       Sort Key: "*SELECT* 1".context_key
                                       Sort Method:  quicksort  Memory: 1607kB
                                       ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual time=0.040..42.014 rows=17892 loops=1)
                                             ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.038..6.953 rows=3583 loops=1)
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.038..5.129 rows=3583 loops=1)
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.019..0.020 rows=1 loops=1)
                                                               Index Cond: ((word)::text = 'insider'::text)
                                                         ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.015..3.292 rows=3583 loops=1)
                                                               Index Cond: (public.article_words.word_key = public.words.word_key)
                                             ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.037..27.131 rows=14309 loops=1)
                                                   ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.036..19.717 rows=14309 loops=1)
                                                         ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.016..0.018 rows=1 loops=1)
                                                               Index Cond: ((word)::text = 'trading'::text)
                                                         ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.017..12.360 rows=14309 loops=1)
                                                               Index Cond: (public.article_words.word_key = public.words.word_key)
                     ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1473)
                           Index Cond: (contexts.context_key = "IN_subquery".context_key)
               ->  Hash  (cost=59848.83..59848.83 rows=1124834 width=4) (actual time=220.171..220.171 rows=48207 loops=1)
                     ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual time=0.029..175.278 rows=48207 loops=1)
                           ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual time=0.015..23.290 rows=16669 loops=1)
                                 Filter: (lower(category) = 'law'::text)
                           ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a  (cost=0.00..2.57 rows=76 width=8) (actual time=0.004..0.007 rows=3 loops=16669)
                                 Index Cond: (a.ancestor_key = bp_categories.context_key)
         ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=649)
               Index Cond: (articles.context_key = contexts.context_key)
   ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual time=0.151..0.152 rows=1 loops=649)
         Index Cond: (matview_82034.context_key = articles.context_key)
 Total runtime: 437.766 ms
(36 rows)


preproduction-20091214=# EXPLAIN ANALYZE
SELECT contexts.context_key FROM articles JOIN contexts using (context_key)
WHERE articles.context_key IN
(
SELECT contexts.context_key FROM contexts
JOIN matview_82034 ON (contexts.context_key=matview_82034.context_key)
WHERE contexts.context_key IN
    (SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'insider'
     INTERSECT
     SELECT context_key FROM article_words JOIN words using (word_key) WHERE word = 'trading')
AND contexts.context_key IN
    (SELECT a.context_key FROM virtual_ancestors a JOIN bp_categories ON (a.ancestor_key = bp_categories.context_key)
     WHERE lower(bp_categories.category) = 'law')
)
AND articles.indexed;
                                                                                                    QUERY PLAN                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=95378.52..95391.87 rows=2 width=4) (actual time=332.545..341.194 rows=546 loops=1)
   ->  Nested Loop  (cost=95378.52..95388.07 rows=2 width=20) (actual time=332.531..337.526 rows=546 loops=1)
         ->  HashAggregate  (cost=95378.52..95378.57 rows=5 width=16) (actual time=332.514..332.830 rows=648 loops=1)
               ->  Nested Loop  (cost=90968.66..95378.51 rows=5 width=16) (actual time=285.869..331.943 rows=648 loops=1)
                     ->  Hash IN Join  (cost=90968.66..95374.73 rows=2 width=12) (actual time=285.858..327.625 rows=649 loops=1)
                           Hash Cond: (public.contexts.context_key = a.context_key)
                           ->  Nested Loop  (cost=12665.41..12675.20 rows=5 width=8) (actual time=55.625..79.734 rows=1473 loops=1)
                                 ->  Subquery Scan "IN_subquery"  (cost=12665.41..12665.68 rows=5 width=4) (actual time=55.602..69.942 rows=1473 loops=1)
                                       ->  SetOp Intersect  (cost=12665.41..12665.64 rows=5 width=4) (actual time=55.601..69.115 rows=1473 loops=1)
                                             ->  Sort  (cost=12665.41..12665.52 rows=46 width=4) (actual time=55.593..60.422 rows=17892 loops=1)
                                                   Sort Key: "*SELECT* 1".context_key
                                                   Sort Method:  quicksort  Memory: 1607kB
                                                   ->  Append  (cost=0.00..12664.14 rows=46 width=4) (actual time=0.039..41.913 rows=17892 loops=1)
                                                         ->  Subquery Scan "*SELECT* 1"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.039..6.911 rows=3583 loops=1)
                                                               ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.037..5.043 rows=3583 loops=1)
                                                                     ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.018..0.019 rows=1 loops=1)
                                                                           Index Cond: ((word)::text = 'insider'::text)
                                                                     ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.015..3.202 rows=3583 loops=1)
                                                                           Index Cond: (public.article_words.word_key = public.words.word_key)
                                                         ->  Subquery Scan "*SELECT* 2"  (cost=0.00..6332.07 rows=23 width=4) (actual time=0.035..27.132 rows=14309 loops=1)
                                                               ->  Nested Loop  (cost=0.00..6331.84 rows=23 width=4) (actual time=0.034..20.008 rows=14309 loops=1)
                                                                     ->  Index Scan using words_word on words  (cost=0.00..2.22 rows=1 width=4) (actual time=0.014..0.015 rows=1 loops=1)
                                                                           Index Cond: ((word)::text = 'trading'::text)
                                                                     ->  Index Scan using article_words_wc on article_words  (cost=0.00..6227.18 rows=8195 width=8) (actual time=0.018..12.502 rows=14309 loops=1)
                                                                           Index Cond: (public.article_words.word_key = public.words.word_key)
                                 ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1473)
                                       Index Cond: (public.contexts.context_key = "IN_subquery".context_key)
                           ->  Hash  (cost=59848.83..59848.83 rows=1124834 width=4) (actual time=221.465..221.465 rows=48207 loops=1)
                                 ->  Nested Loop  (cost=0.00..59848.83 rows=1124834 width=4) (actual time=0.030..174.941 rows=48207 loops=1)
                                       ->  Seq Scan on bp_categories  (cost=0.00..1231.49 rows=16669 width=4) (actual time=0.015..23.377 rows=16669 loops=1)
                                             Filter: (lower(category) = 'law'::text)
                                       ->  Index Scan using virtual_ancestor_key_idx on virtual_ancestors a  (cost=0.00..2.57 rows=76 width=8) (actual time=0.004..0.007 rows=3 loops=16669)
                                             Index Cond: (a.ancestor_key = bp_categories.context_key)
                     ->  Index Scan using matview_82034_ck on matview_82034  (cost=0.00..1.88 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=649)
                           Index Cond: (matview_82034.context_key = public.contexts.context_key)
         ->  Index Scan using article_key_idx on articles  (cost=0.00..1.89 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=648)
               Index Cond: (articles.context_key = public.contexts.context_key)
               Filter: articles.indexed
   ->  Index Scan using contexts_pkey on contexts  (cost=0.00..1.89 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=546)
         Index Cond: (public.contexts.context_key = articles.context_key)
 Total runtime: 341.727 ms
(41 rows)




preproduction-20091214=# select count(*),indexed from articles group by indexed;
 count  | indexed
--------+---------
 499730 | f
 341930 | t
-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux