Re: Simple Query not using Primary Key Index

Aron Podrigal <aronp@xxxxxxxxxxxxxxxxxx> · Tue, 07 Feb 2017 00:12:36 +0000

Here is the dumb part 😱 Turns out what caused my confusion that I had an identical table in another schema and I used different sessions with a different search_path with different results. 
Thank you  for helping me on this.

On Mon, Feb 6, 2017, 7:03 PM Vitaly Burovoy <vitaly.burovoy@xxxxxxxxx> wrote:
On 2/6/17, Aron Podrigal <aronp@xxxxxxxxxxxxxxxxxx> wrote:

> In general, I do not understand why a PK index should not be used when the

> query can be satisfied by the index itself. Can anyone give some reason to

> this?

>

> On Mon, Feb 6, 2017, 6:29 PM Aron Podrigal <aronp@xxxxxxxxxxxxxxxxxx>

> wrote:

>

>> EXPLAIN ANALYZE does not tell me much. It doesn't say why the planner

>> opts

>> for not using the Primary key index.

>>

>> On Mon, Feb 6, 2017, 6:23 PM Alban Hertroys <haramrae@xxxxxxxxx> wrote:

>>

>>

>> > On 7 Feb 2017, at 0:16, Podrigal, Aron <aronp@xxxxxxxxxxxxxxxxxx>

>> > wrote:

>> >

>> > Hi,

>> >

>> > I noticed when I do a simple SELECT id FROM mytable WHERE id =

>> 'cb81d070-4213-465f-b32e-b8db43b83a25'::UUID  Postgres does not use the

>> primary key index and opts for a Seq Scan.

>> >

>> > I of course did VACUUM ANALYZE and I have reset statistics But no sign.

>> Is there any particular thing I should be looking at?

>>

>> An EXPLAIN ANALYZE would be a good start.

>>

As I mentioned before[1], it depends on statistics.

For instance:

postgres=# DROP TABLE IF EXISTS mytable;

DROP TABLE

postgres=# CREATE TABLE mytable(id uuid PRIMARY KEY);

CREATE TABLE

postgres=# INSERT INTO mytable SELECT uuid_generate_v4() FROM

generate_series(1,100);

INSERT 0 100

postgres=# EXPLAIN SELECT id FROM mytable WHERE id =

'cb81d070-4213-465f-b32e-b8db43b83a25'::UUID;  -- 1

                                    QUERY PLAN

----------------------------------------------------------------------------------

 Index Only Scan using mytable_pkey on mytable  (cost=0.15..8.17

rows=1 width=16)

   Index Cond: (id = 'cb81d070-4213-465f-b32e-b8db43b83a25'::uuid)

(2 rows)

postgres=# analyze mytable;

ANALYZE

postgres=# EXPLAIN SELECT id FROM mytable WHERE id =

'cb81d070-4213-465f-b32e-b8db43b83a25'::UUID;  -- 2

                          QUERY PLAN

---------------------------------------------------------------

 Seq Scan on mytable  (cost=0.00..2.25 rows=1 width=16)

   Filter: (id = 'cb81d070-4213-465f-b32e-b8db43b83a25'::uuid)

(2 rows)

postgres=#

postgres=# DROP TABLE IF EXISTS mytable;

DROP TABLE

postgres=# CREATE TABLE mytable(id uuid PRIMARY KEY);

CREATE TABLE

postgres=# INSERT INTO mytable SELECT uuid_generate_v4() FROM

generate_series(1,1000);

INSERT 0 1000

postgres=# analyze mytable;

ANALYZE

postgres=# EXPLAIN SELECT id FROM mytable WHERE id =

'cb81d070-4213-465f-b32e-b8db43b83a25'::UUID;  -- 3

                                    QUERY PLAN

----------------------------------------------------------------------------------

 Index Only Scan using mytable_pkey on mytable  (cost=0.28..8.29

rows=1 width=16)

   Index Cond: (id = 'cb81d070-4213-465f-b32e-b8db43b83a25'::uuid)

(2 rows)

Presence of statistics does not guarantees that indexes will be used.

The first EXPLAIN shows IndexOnlyScan because PG's assumption it will

be faster; the second one shows SeqScan because PG is sure such AM is

faster for 100 rows (I guess they just fit into one page) whereas the

third one shows IndexOnlyScan again because it is more effective than

SeqScan among 100 * 10 rows.

also there are many other factors influence to PG's decision including

random_page_cost and seq_page_cost for tablespaces; fillfactor for

indexes and tables and many more.

You have sent neither table DDL nor EXPLAIN ANALYZE result.

If a query runs fast enough, I would not pay attention to used access method.

[1] https://www.postgresql.org/message-id/CAKOSWNkhGqm6wWuAcrjjJYL0eKNQ6odFREXjgnki9bwA0Hb-6Q%40mail.gmail.com

--

Best regards,

Vitaly Burovoy