Re: query using incorrect index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



You’re right, a lot of information is missing but I’m unsure that the other information will make too much difference.

I could drop all the other indexes on the table which aren’t used here and the queries would still use the indexes they are currently using.

 

I appreciate the idea that a boolean column selectivity might not be great. I’ve just tried creating indexes as follows:

CREATE INDEX messageq17

  ON messageq_table

  USING btree

  (staff_ty, staff_id, incoming, inactive, entity_id);

 

CREATE INDEX messageq18

  ON messageq_table

  USING btree

  (staff_ty, staff_id);

 

When running query 2 as it stands the same thing happens, it still uses the messageq1 index.

 

The query is logically the same as using max, you are correct, but it’s generated on the fly so the limit or the queried column may change.

 

The query plans were for the second query as I’m unsure that the first query is really relevant, it was simply there to justify the messageq1 index.

 

Thanks,

 

From: Robert Klemme [mailto:shortcutter@xxxxxxxxxxxxxx]
Sent: 03 August 2012 10:18
To: Russell Keane; pgsql-performance
Subject: Re: query using incorrect index

 

 

On Thu, Aug 2, 2012 at 4:54 PM, Russell Keane <Russell.Keane@xxxxxxxxxx> wrote:

 

Using PG 9.0 and given 2 queries (messageq_current is a view on the messageq_table):

 

select entity_id from messageq_current

where entity_id = 123456;

 

select entity_id from messageq_current

where incoming = true

and inactive = false

and staff_ty = 2

and staff_id = 2

order by entity_id desc

limit 1;

 

and 2 indexes (there are 15 indexes in total but they are left out here for brevity):

 

messageq1:

CREATE INDEX messageq1

  ON messageq_table

  USING btree

  (entity_id);

 

And messageq4:

 

CREATE INDEX messageq4

  ON messageq_table

  USING btree

  (inactive, staff_ty, staff_id, incoming, tran_dt);


Of course a lot of detail is missing (full schema of table, all the other indexes) but with "inactive" a boolean column I suspect selectivity might not be too good here and so having it as a first column in a covering index is at least questionable.  If query 2 is frequent you might also want to consider creating a partial index only on (staff_ty, staff_id) with filtering criteria on incoming and active as present in query 2.

Btw, why don't you formulate query 2 as max query?

select max(entity_id) as entity_id

from messageq_current

where incoming = true

and inactive = false

and staff_ty = 2

and staff_id = 2;

 

 

With the messageq1 index present, query 1 is very quick (0.094ms) and query 2 is very slow (241.515ms).

If I remove messageq1 then query 2 uses messageq4 and is very quick (0.098ms) but then query 1 must use a different index and is therefore slower (> 5ms).

 

So, to the Query plans:


Of which query?  Shouldn't there be four plans in total?  I'd post plans here:
http://explain.depesz.com/
 

With messageq1:

"Limit  (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481 rows=0 loops=1)"

"  Output: messageq_table.entity_id"

"  Buffers: shared hit=32 read=18870 written=12"

"  ->  Index Scan Backward using messageq1 on prac_live_10112.messageq_table  (cost=0.00..66762.53 rows=25 width=4) (actual time=241.479..241.479 rows=0 loops=1)"

"        Output: messageq_table.entity_id"

"        Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (aud_status_to_flag(messageq_table.aud_status) = 1))"

"        Buffers: shared hit=32 read=18870 written=12"

"Total runtime: 241.515 ms"

 

Without messageq1:

"Limit  (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055 rows=0 loops=1)"

"  Output: messageq_table.entity_id"

"  Buffers: shared read=3"

"  ->  Sort  (cost=12534.45..12534.51 rows=25 width=4) (actual time=0.054..0.054 rows=0 loops=1)"

"        Output: messageq_table.entity_id"

"        Sort Key: messageq_table.entity_id"

"        Sort Method:  quicksort  Memory: 17kB"

"        ->  Bitmap Heap Scan on prac_live_10112.messageq_table  (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0 loops=1)"

"              Output: messageq_table.entity_id"

"              Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))"

"              Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))"

"              Buffers: shared read=3"

"              ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)"

"                    Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))"

"                    Buffers: shared read=3"

"Total runtime: 0.098 ms"

 

Clearly the statistics are off somehow but I really don’t know where to start.

 

Any help you can give me would be very much appreciated.


Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux