Greetings.
--
I have a question on why planner chooses `IndexScan` for the following query:
SELECT la.loan_id, la.due_date, la.is_current
FROM loan_agreements la WHERE la.is_current AND '2016-08-11' > la.due_date;
Relevant (cannot post it all, sorry) table definition is:
Column Type Modifiers
------------------------------ --------------------------- ---------
id bigint not null
...
is_current boolean not null
due_date date not null
loan_id bigint
Indexes:
"loan_agreements_pkey" PRIMARY KEY, btree (id)
...
"idx_loan_agreements_due_date" btree (due_date)
"idx_loan_agreemnets_loan_id_cond_is_current_true" btree (loan_id) WHERE is_current = true
Some stats:
SELECT relname,reltuples::numeric,relpages FROM pg_class WHERE oid IN ('loan_agreements'::regclass, 'idx_loan_agreemnets_loan_id_cond_is_current_true'::regclass, 'idx_loan_agreements_due_date'::regclass);
relname reltuples relpages
------------------------------------------------ --------- --------
idx_loan_agreements_due_date 664707 1828
idx_loan_agreemnets_loan_id_cond_is_current_true 237910 655
loan_agreements 664707 18117
Settings:
SELECT name,setting,unit FROM pg_settings WHERE name ~ '(buffers|mem|cost)$';
name setting unit
-------------------- -------- ----
autovacuum_work_mem 524288 kB
cpu_index_tuple_cost 0.005 ¤
cpu_operator_cost 0.0025 ¤
cpu_tuple_cost 0.01 ¤
maintenance_work_mem 16777216 kB
random_page_cost 2.5 ¤
seq_page_cost 1 ¤
shared_buffers 1572864 8kB
temp_buffers 8192 8kB
wal_buffers 2048 8kB
work_mem 65536 kB
PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
Planner chooses the following plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..16986.53 rows=226145 width=13) (actual time=0.054..462.394 rows=216530 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Buffers: shared hit=208343 read=18399
Planning time: 0.168 ms
Execution time: 479.773 ms
If I disable IndexScans, plan changes likes this:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on loan_agreements la (cost=2884.01..23974.88 rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
Recheck Cond: is_current
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Heap Blocks: exact=18117
Buffers: shared hit=18212 read=557
-> Bitmap Index Scan on idx_loan_agreemnets_loan_id_cond_is_current_true (cost=0.00..2827.47 rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
Buffers: shared hit=119 read=533
Planning time: 0.171 ms
Execution time: 214.341 ms
Question is — why IndexScan over partial index is estimated less than BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan over 1/3 of table is not a good thing — IndexScan is touching 10x more pages and in a typical situation those are cold.
Thanks in advance.
Victor Y. Yegorov