Re: Planner doesn't take indexes into account

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



random_page_cost = 4.0
seq_page_cost = 1.0

There is about 500,000 rows and about 500 new rows each business day.

About 96% of rows meet given conditions, that is, count shoud be about 480,000.

BR,
Grzegorz Olszewski


Date: Tue, 27 May 2014 14:14:21 -0700
Subject: Re: [PERFORM] Planner doesn't take indexes into account
From: rummandba@xxxxxxxxx
To: grzegorz.olszewski@xxxxxxxxxxx
CC: pgsql-performance@xxxxxxxxxxxxxx

What is random_page_cost and seq_page_cost in your server?
And how many rows does the table have?


On Tue, May 27, 2014 at 2:09 PM, Grzegorz Olszewski <grzegorz.olszewski@xxxxxxxxxxx> wrote:
Hi,

I wonder why planner uses Seq Scan instead of Index Scan.

Here is my table (partial):
content.contents
-------------------------+-----------------------------+-----------------------------------------------------------------
 id                      | bigint                      | niepusty domyślnie nextval('content.contents_id_seq'::regclass)
 version                 | integer                     | niepusty
 date_published          | timestamp without time zone |
 moderation_status       | character varying(50)       |
 publication_status      | character varying(30)       |

And indexes (there are some other indexes too):
    "contents_id_pkey" PRIMARY KEY, btree (id)
    "contents_date_published_idx" btree (date_published)
    "contents_moderation_status_idx" btree (moderation_status)
    "contents_publication_status_idx" btree (publication_status)

I tried also creating following indexes:
    "contents_date_published_publication_status_moderation_statu_idx" btree (date_published, publication_status, moderation_status)
    "contents_publication_status_idx1" btree ((publication_status::text))
    "contents_moderation_status_idx1" btree ((moderation_status::text))

Then for this query (genrated by Hibernate):
explain (analyze, buffers) select count(*) as y0_ from content.contents this_ inner join content.content_categories cat1_ on this_.CONTENT_CATEGORY_ID=cat1_.ID where cat1_.name in ([...])
and this_.date_published<='2014-05-26 12:23:31.557000 +02:00:00'
and (this_.PUBLICATION_STATUS is null or this_.PUBLICATION_STATUS<>'DRAFT')
and (this_.moderation_status is null or this_.moderation_status<>'DANGEROUS')
and exists(select * from content.content_visibilities cv where cv.content_id = this_.ID and cv.user_group_id in (1,2));

Planner creates such plan:
                                                                                                                                  QUERY PLAN                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=31706.84..106020.81 rows=21871 width=2076) (actual time=1197.658..6012.406 rows=430218 loops=1)
   Hash Cond: (this_.id = cv.content_id)
   Buffers: shared hit=5 read=59031 written=3, temp read=47611 written=47549
   ->  Hash Join  (cost=2.22..56618.11 rows=22881 width=2076) (actual time=0.163..1977.304 rows=430221 loops=1)
         Hash Cond: (this_.content_category_id = cat1_.id)
         Buffers: shared hit=1 read=46829 written=1
         ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
               Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))
               Rows Removed by Filter: 50
               Buffers: shared read=46829 written=1
         ->  Hash  (cost=2.17..2.17 rows=4 width=46) (actual time=0.089..0.089 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               Buffers: shared hit=1
               ->  Seq Scan on content_categories cat1_  (cost=0.00..2.17 rows=4 width=46) (actual time=0.053..0.076 rows=4 loops=1)
                     Filter: ((name)::text = ANY ('{przeglad-prasy/rp,przeglad-prasy/parkiet,komunikat-z-rynku-pap-emitent,komunikat-z-rynku-pap-depesze}'::text[]))
                     Rows Removed by Filter: 74
                     Buffers: shared hit=1
   ->  Hash  (cost=24435.09..24435.09 rows=443083 width=8) (actual time=1197.146..1197.146 rows=447624 loops=1)
         Buckets: 4096  Batches: 32  Memory Usage: 560kB
         Buffers: shared hit=4 read=12202 written=2, temp written=1467
         ->  Bitmap Heap Scan on content_visibilities cv  (cost=7614.55..24435.09 rows=443083 width=8) (actual time=61.034..647.729 rows=447624 loops=1)
               Recheck Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
               Buffers: shared hit=4 read=12202 written=2
               ->  Bitmap Index Scan on content_visibilities_user_group_id_idx  (cost=0.00..7503.78 rows=443083 width=0) (actual time=58.680..58.680 rows=447626 loops=1)
                     Index Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
                     Buffers: shared hit=3 read=1226
 Total runtime: 6364.689 ms
(27 wierszy)

The suspicious part is:
         ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
               Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))

I don't understand why planner doesn't use indexes. The problem is there are about 0.5M rows satisfying condition (almost every row in the table). Could you please explain this behavior?

I'm using PostgreSQL 9.2.8 on Ubuntu 12.04 LTS x86_64

Best regards,
Grzegorz Olszewski


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux