Part of a larger problem, I'm trying to optimize a rather simple query which is basically:
SELECT * FROM table WHERE indexed_column > ... ORDER BY indexed_column DESC;
(see attachment for all details: table definition, query, query plans)
For small ranges it will choose an index scan which is very good. For somewhat larger ranges (not very large yet) it will switch to a bitmap scan + sorting. Pgsql probably thinks that the larger the range, the better a bitmap scan is because it reads more effectively. However, in my case, the larger the query, the worse bitmap+sort performs compared to index scan:
Small range (5K rows): 5.4 msec (b+s) vs 3.3 msec (i) -- performance penalty of ~50%
Large range (1.5M rows): 6400 sec (b+s) vs 2100 msec (i) -- performance penalty of ~200%
How can I make pgsql realize that it should always pick the index scan?
Thanks!
Kind regards,
Mathieu
shs-dev=# \d log_event Table "public.log_event" Column | Type | Modifiers -----------------+--------------------------+-------------------------------------------------------- id | bigint | not null default nextval('log_event_id_seq'::regclass) user_id | integer | ip | inet | not null action_id | integer | not null object1_id | integer | object2_id | integer | event_timestamp | timestamp with time zone | not null data | text | comments | text | Indexes: "log_event_pkey" PRIMARY KEY, btree (id) "log_event_action_id_idx" btree (action_id) "log_event_object1_idx" btree (object1_id) "log_event_object2_idx" btree (object2_id) "log_event_timestamp_idx" btree (event_timestamp) "log_event_user_id_idx" btree (user_id) Foreign-key constraints: "log_event_action_id_fkey" FOREIGN KEY (action_id) REFERENCES config.log_action(id) ======================================================================================================================================================= CASE 1: PostgreSQL switches less efficient plan with bigger range *** INTERVAL 4 hours --- uses index scan shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '4 hour' order by event_timestamp desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan Backward using log_event_timestamp_idx on log_event (cost=0.00..9405.04 rows=3421 width=338) (actual time=0.047..2.510 rows=3507 loops=1) Index Cond: (event_timestamp > (now() - '04:00:00'::interval)) Total runtime: 2.767 ms *** INTERVAL 5 hours --- uses bitmap scan + sort shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '5 hour' order by event_timestamp desc; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=11990.00..12001.14 rows=4456 width=338) (actual time=4.751..5.122 rows=4929 loops=1) Sort Key: event_timestamp Sort Method: quicksort Memory: 578kB -> Bitmap Heap Scan on log_event (cost=87.12..11719.93 rows=4456 width=338) (actual time=0.585..2.443 rows=4929 loops=1) Recheck Cond: (event_timestamp > (now() - '05:00:00'::interval)) -> Bitmap Index Scan on log_event_timestamp_idx (cost=0.00..86.01 rows=4456 width=0) (actual time=0.530..0.530 rows=4929 loops=1) Index Cond: (event_timestamp > (now() - '05:00:00'::interval)) Total runtime: 5.431 ms ======================================================================================================================================================= CASE 2: Comparison in run time for small range *** INTERVAL 6 hours -- uses bitmap scan + sort shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '6 hour' order by event_timestamp desc; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=13085.87..13098.44 rows=5026 width=338) (actual time=4.611..5.015 rows=5507 loops=1) Sort Key: event_timestamp Sort Method: quicksort Memory: 623kB -> Bitmap Heap Scan on log_event (cost=95.54..12776.89 rows=5026 width=338) (actual time=0.626..2.034 rows=5507 loops=1) Recheck Cond: (event_timestamp > (now() - '06:00:00'::interval)) -> Bitmap Index Scan on log_event_timestamp_idx (cost=0.00..94.28 rows=5026 width=0) (actual time=0.566..0.566 rows=5507 loops=1) Index Cond: (event_timestamp > (now() - '06:00:00'::interval)) Total runtime: 5.375 ms ** INTERVAL 6 hours WITHOUT sort (enable_sort = off) -- uses index scan shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '6 hour' order by event_timestamp desc; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan Backward using log_event_timestamp_idx on log_event (cost=0.00..13446.79 rows=5032 width=338) (actual time=0.051..2.948 rows=5515 loops=1) Index Cond: (event_timestamp > (now() - '06:00:00'::interval)) Total runtime: 3.337 ms ======================================================================================================================================================= CASE 3: Comparison in run time for large range *** INTERVAL 1 year -- uses bitmap scan + sort shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '1 year' order by event_timestamp desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Sort (cost=481763.31..485634.61 rows=1548520 width=338) (actual time=5423.628..6286.148 rows=1551923 loops=1) Sort Key: event_timestamp Sort Method: external merge Disk: 90488kB -> Seq Scan on log_event (cost=0.00..79085.92 rows=1548520 width=338) (actual time=0.022..2195.527 rows=1551923 loops=1) Filter: (event_timestamp > (now() - '1 year'::interval)) Total runtime: 6407.377 ms *** INTERVAL 1 year WITHOUT sort -- uses index scan shs-dev=# explain analyze select * from log_event where event_timestamp > NOW() - interval '1 year' order by event_timestamp desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan Backward using log_event_timestamp_idx on log_event (cost=0.00..2412996.64 rows=1548521 width=338) (actual time=0.052..2015.871 rows=1551926 loops=1) Index Cond: (event_timestamp > (now() - '1 year'::interval)) Total runtime: 2112.643 ms
Attachment:
postgresql.conf
Description: Binary data
-- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance