Dear all,
I have a query that i used to fire many times in our application and
need to be tuned at the deeper level.
Query :
explain analyze select p.crawled_page_id, p.content,
w.publication_name, w.country_name, p.publishing_date,m.doc_category
,l.display_name as location, l.lat, l.lon, l.pop_rank,
p.crawled_page_url, substring(p.content,1,250)
as display_text, p.heading from page_content_terror p, location l,
loc_context_terror lc, meta_terror m,
website_master w where p.crawled_page_id>0 and
p.crawled_page_id=lc.source_id and lc.location_id=l.id
and p.crawled_page_id=m.doc_id and p.url_id= w.url_id limit 1000;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=483.31..1542.79 rows=1000 width=5460) (actual
time=8.797..125504.603 rows=1000 loops=1)
-> Hash Join (cost=483.31..169719466.71 rows=160190779 width=5460)
(actual time=8.794..125502.974 rows=1000 loops=1)
Hash Cond: (p.url_id = w.url_id)
-> Nested Loop (cost=0.00..163675973.13 rows=13034265
width=4056) (actual time=1.294..125463.784 rows=1000 loops=1)
-> Nested Loop (cost=0.00..115348580.99 rows=13034265
width=3024) (actual time=1.219..125436.104 rows=1156 loops=1)
Join Filter: (p.crawled_page_id = lc.source_id)
-> Nested Loop (cost=0.00..10960127.98 rows=53553
width=3024) (actual time=0.037..66671.887 rows=1156 loops=1)
Join Filter: (p.crawled_page_id = m.doc_id)
-> Seq Scan on page_content_terror p
(cost=0.00..8637.64 rows=2844 width=2816) (actual time=0.013..5.884
rows=1156 loops=1)
Filter: (crawled_page_id > 0)
-> Seq Scan on meta_terror m
(cost=0.00..3803.66 rows=3766 width=208) (actual time=0.003..30.117
rows=45148 loops=1156)
-> Seq Scan on loc_context_terror lc
(cost=0.00..1340.78 rows=48678 width=8) (actual time=0.376..24.675
rows=41658 loops=1156)
-> Index Scan using location_pk on location l
(cost=0.00..3.70 rows=1 width=1040) (actual time=0.016..0.017 rows=1
loops=1156)
Index Cond: (l.id = lc.location_id)
-> Hash (cost=452.58..452.58 rows=2458 width=1412) (actual
time=7.344..7.344 rows=2458 loops=1)
-> Seq Scan on website_master w (cost=0.00..452.58
rows=2458 width=1412) (actual time=0.013..4.094 rows=2458 loops=1)
Total runtime: 125506.007 ms
/***********************************After adding
indexes*******************************************/
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
---------
Limit (cost=483.31..583.30 rows=1000 width=5460) (actual
time=9.769..63.374 rows=1000 loops=1)
-> Hash Join (cost=483.31..871571182.73 rows=8716355049 width=5460)
(actual time=9.765..62.314 rows=1000 loops=1)
Hash Cond: (p.url_id = w.url_id)
-> Nested Loop (cost=0.00..542756386.37 rows=709224822
width=4056) (actual time=1.640..30.895 rows=1000 loops=1)
-> Nested Loop (cost=0.00..2587537.38 rows=3139483
width=3856) (actual time=1.558..22.552 rows=1000 loops=1)
-> Nested Loop (cost=0.00..157876.94 rows=41693
width=1040) (actual time=1.419..13.039 rows=1000 loops=1)
-> Seq Scan on loc_context_terror lc
(cost=0.00..1270.93 rows=41693 width=8) (actual time=1.346..2.264
rows=1156 loops=1)
-> Index Scan using location_pk on location
l (cost=0.00..3.74 rows=1 width=1040) (actual time=0.005..0.006 rows=1
loops=1156)
Index Cond: (l.id = lc.location_id)
-> Index Scan using idx_crawled_s9 on
page_content_terror p (cost=0.00..57.34 rows=75 width=2816) (actual
time=0.005..0.006 rows=1 loo
ps=1000)
Index Cond: ((p.crawled_page_id > 0) AND
(p.crawled_page_id = lc.source_id))
-> Index Scan using idx_doc_s9 on meta_terror m
(cost=0.00..169.23 rows=226 width=208) (actual time=0.005..0.006 rows=1
loops=1000)
Index Cond: (m.doc_id = p.crawled_page_id)
-> Hash (cost=452.58..452.58 rows=2458 width=1412) (actual
time=7.964..7.964 rows=2458 loops=1)
-> Seq Scan on website_master w (cost=0.00..452.58
rows=2458 width=1412) (actual time=0.009..4.495 rows=2458 loops=1)
Total runtime: 64.396 ms
Don't know why it uses Seq Scan on loc_context_terror as i have indexes
on the desired columns as well.
Thanks & best Regards,
Adarsh Sharma
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance