Re: Querying with multicolumn index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We aren't using any special planner settings - all enable_* options are "on".

On Sat, Dec 10, 2016 at 12:06 AM, Daniel Blanch Bataller
<daniel.blanch.bataller@xxxxxxxxx> wrote:
> As a quick fix: Have you considered dropping ix_updates_time index?

We do occasionally want to use ix_updates_time, although not very often.

> I’ve been able to reproduce the issue, but with bigger numbers than you. When I dropped ix_updates_time it went much much faster. It uses updates_driver_id_time_idx index instead.
>
> For some reason the planner does not make a good estimation in this case. Can you show us EXPLAIN (ANALYZE, BUFFERS) before and after dropping ix_updates_time index? Can you show us too the output of SHOW shared_buffers; ?

Here's EXPLAIN (ANALYZE, BUFFERS) with the above bad query on a cold cache:

 Limit  (cost=0.09..0.70 rows=1 width=56) (actual
time=244846.915..244846.915 rows=1 loops=1)
   Buffers: shared hit=3999254 read=57831
   I/O Timings: read=242139.661
   ->  Index Scan Backward using ix_updates_time on updates
(cost=0.09..271696.74 rows=442550 width=56) (actual
time=244846.913..244846.913 rows=1 loops=1)
         Filter: (driver_id = 30132)
         Rows Removed by Filter: 5316811
         Buffers: shared hit=3999254 read=57831
         I/O Timings: read=242139.661
 Total runtime: 244846.946 ms

and after dropping ix_updates_time:

 Limit  (cost=0.11..0.98 rows=1 width=56) (actual time=2.270..2.271
rows=1 loops=1)
   Buffers: shared hit=1 read=4
   I/O Timings: read=2.230
   ->  Index Scan Backward using updates_driver_id_time_idx on updates
 (cost=0.11..382307.69 rows=442550 width=56) (actual time=2.270..2.270
rows=1 loops=1)
         Index Cond: (driver_id = 30132)
         Buffers: shared hit=1 read=4
         I/O Timings: read=2.230
 Total runtime: 2.305 ms

and `SHOW shared_buffers;`

 shared_buffers
----------------
 244MB

> I suspect the issue has to do with low shared_buffers configuration and cache misses, and maybe some costs suboptimal configuration I’ll try to find it out, if anyone can enlighten us it will be very welcomed.
>
>
> P.S. Meanwhile If you still need 'time' index, you can create an index using ‘time' and ‘customer_id' in this order.

Did you mean an index on (time, driver_id)? I did:

CREATE INDEX CONCURRENTLY ix_updates_time_driver_id ON updates
("time", driver_id)

but seems like the planner will use it for driver_id having more than
~300k rows:

 Limit  (cost=0.11..0.79 rows=1 width=56) (actual
time=115.051..115.052 rows=1 loops=1)
   Buffers: shared hit=20376
   ->  Index Scan Backward using ix_updates_time_driver_id on updates
(cost=0.11..302189.90 rows=443924 width=56) (actual
time=115.048..115.048 rows=1 loops=1)
         Index Cond: (driver_id = 30132)
         Buffers: shared hit=20376
 Total runtime: 115.091 ms

It does seem faster than when having an index on just "time", but
still not optimal.

Really appreciate everyone's help!

-- 
Eric Jiang, DoubleMap
eric@xxxxxxxxxxxxx | www.doublemap.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux