What is random_page_cost and seq_page_cost in your server?
And how many rows does the table have?
On Tue, May 27, 2014 at 2:09 PM, Grzegorz Olszewski <grzegorz.olszewski@xxxxxxxxxxx> wrote:
Hi,
I wonder why planner uses Seq Scan instead of Index Scan.
Here is my table (partial):
content.contents
-------------------------+-----------------------------+-----------------------------------------------------------------
id | bigint | niepusty domyślnie nextval('content.contents_id_seq'::regclass)
version | integer | niepusty
date_published | timestamp without time zone |
moderation_status | character varying(50) |
publication_status | character varying(30) |
And indexes (there are some other indexes too):
"contents_id_pkey" PRIMARY KEY, btree (id)
"contents_date_published_idx" btree (date_published)
"contents_moderation_status_idx" btree (moderation_status)
"contents_publication_status_idx" btree (publication_status)
I tried also creating following indexes:
"contents_date_published_publication_status_moderation_statu_idx" btree (date_published, publication_status, moderation_status)
"contents_publication_status_idx1" btree ((publication_status::text))
"contents_moderation_status_idx1" btree ((moderation_status::text))
Then for this query (genrated by Hibernate):
explain (analyze, buffers) select count(*) as y0_ from content.contents this_ inner join content.content_categories cat1_ on this_.CONTENT_CATEGORY_ID=cat1_.ID where cat1_.name in ([...])
and this_.date_published<='2014-05-26 12:23:31.557000 +02:00:00'
and (this_.PUBLICATION_STATUS is null or this_.PUBLICATION_STATUS<>'DRAFT')
and (this_.moderation_status is null or this_.moderation_status<>'DANGEROUS')
and exists(select * from content.content_visibilities cv where cv.content_id = this_.ID and cv.user_group_id in (1,2));
Planner creates such plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=31706.84..106020.81 rows=21871 width=2076) (actual time=1197.658..6012.406 rows=430218 loops=1)
Hash Cond: (this_.id = cv.content_id)
Buffers: shared hit=5 read=59031 written=3, temp read=47611 written=47549
-> Hash Join (cost=2.22..56618.11 rows=22881 width=2076) (actual time=0.163..1977.304 rows=430221 loops=1)
Hash Cond: (this_.content_category_id = cat1_.id)
Buffers: shared hit=1 read=46829 written=1
-> Seq Scan on contents this_ (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))
Rows Removed by Filter: 50
Buffers: shared read=46829 written=1
-> Hash (cost=2.17..2.17 rows=4 width=46) (actual time=0.089..0.089 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
Buffers: shared hit=1
-> Seq Scan on content_categories cat1_ (cost=0.00..2.17 rows=4 width=46) (actual time=0.053..0.076 rows=4 loops=1)
Filter: ((name)::text = ANY ('{przeglad-prasy/rp,przeglad-prasy/parkiet,komunikat-z-rynku-pap-emitent,komunikat-z-rynku-pap-depesze}'::text[]))
Rows Removed by Filter: 74
Buffers: shared hit=1
-> Hash (cost=24435.09..24435.09 rows=443083 width=8) (actual time=1197.146..1197.146 rows=447624 loops=1)
Buckets: 4096 Batches: 32 Memory Usage: 560kB
Buffers: shared hit=4 read=12202 written=2, temp written=1467
-> Bitmap Heap Scan on content_visibilities cv (cost=7614.55..24435.09 rows=443083 width=8) (actual time=61.034..647.729 rows=447624 loops=1)
Recheck Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
Buffers: shared hit=4 read=12202 written=2
-> Bitmap Index Scan on content_visibilities_user_group_id_idx (cost=0.00..7503.78 rows=443083 width=0) (actual time=58.680..58.680 rows=447626 loops=1)
Index Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
Buffers: shared hit=3 read=1226
Total runtime: 6364.689 ms
(27 wierszy)
The suspicious part is:
-> Seq Scan on contents this_ (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))
I don't understand why planner doesn't use indexes. The problem is there are about 0.5M rows satisfying condition (almost every row in the table). Could you please explain this behavior?
I'm using PostgreSQL 9.2.8 on Ubuntu 12.04 LTS x86_64
Best regards,
Grzegorz Olszewski