I have the following tables:
- m(pk bigserial primary key, status text): with a single row
- s(pk bigserial primary key, status text, action_at date, m_fk bigint):
* 80% of the data has action_at between the current date and 1 year ago
and status of E or C
* 20% of the data has action_at between 5 days ago and 25 days into the
future and status of P, PD, or A
I have two partial indexes:
- s_pk_action_at on s(pk, action_at) where status in ('P', 'PD', 'A')
- s_action_at_pk on s(action_at, pk) where status in ('P', 'PD', 'A')
With the query:
SELECT s.pk FROM s
INNER JOIN m ON m.pk = s.m_fk
WHERE
s.status IN ('A', 'PD', 'P')
AND (action_at <= '2018-09-06')
AND s.status IN ('A', 'P')
AND m.status = 'A';
I generally expect the index s_action_at_pk to always be preferred over s_pk_action_at. And on stock Postgres it does in fact use that index (with a bitmap index scan).
We like to set random_page_cost = 2 since we use fast SSDs only. With that change Postgres strongly prefers the index s_pk_action_at unless I both disable the other index and turn off bitmap heap scans.
I'm attaching the following plans:
- base_plan.txt: default costs; both indexes available
- base_plan_rpc2.txt: random_page_cost = 2; both indexes available
- inddisabled_plan_rpc2.txt: random_page_cost = 2; only s_action_at_pk available
- inddisabled_bhsoff_plan_rpc2.txt: random_page_cost = 2; enable_bitmapscan = false; only s_action_at_pk available
A couple of questions:
- How is s_pk_action_at ever efficient to scan? Given that the highest cardinality (primary key) column is first, wouldn't an index scan effectively have to scan the entire index?
- Why does index scan on s_action_at_pk reads over 2x as many blocks as the bitmap heap scan with the same index?
- Would you expect Postgres to generally always prefer using the s_action_at_pk index over the s_pk_action_at index for this query? I realize changing the random page cost is part of what's driving this, but I still can't imagine reading the full s_pk_action_at index (assuming that's what it is doing) could ever be more valuable.
As a side note, the planner is very bad at understanding a query that happens (I realize you wouldn't write this by hand, but ORMs) when you have a where clause like:
s.status IN ('A', 'PD', 'P') AND s.status IN ('A', 'P')
the row estimates are significantly different from a where clause with only:
s.status IN ('A', 'P')
even though semantically those are identical.
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=0.42..8795.84 rows=11004 width=8) (actual time=0.020..12.338 rows=8919 loops=1) Join Filter: (s.m_fk = m.pk) Buffers: shared hit=8234 -> Seq Scan on m (cost=0.00..1.01 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1) Filter: (status = 'A'::text) Buffers: shared hit=1 -> Index Scan using s_action_at_pk on s (cost=0.42..8657.28 rows=11004 width=16) (actual time=0.014..11.008 rows=8919 loops=1) Index Cond: (action_at <= '2018-09-06'::date) Filter: (status = ANY ('{A,P}'::text[])) Rows Removed by Filter: 4483 Buffers: shared hit=8233 Planning Time: 0.502 ms Execution Time: 12.793 ms (13 rows)
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1900.77..6768.34 rows=11004 width=8) (actual time=1.345..11.812 rows=8919 loops=1) Join Filter: (s.m_fk = m.pk) Buffers: shared hit=3184 -> Seq Scan on m (cost=0.00..1.01 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=1) Filter: (status = 'A'::text) Buffers: shared hit=1 -> Bitmap Heap Scan on s (cost=1900.77..6629.78 rows=11004 width=16) (actual time=1.337..10.515 rows=8919 loops=1) Recheck Cond: ((action_at <= '2018-09-06'::date) AND (status = ANY ('{P,PD,A}'::text[]))) Filter: (status = ANY ('{A,P}'::text[])) Rows Removed by Filter: 4483 Heap Blocks: exact=3129 Buffers: shared hit=3183 -> Bitmap Index Scan on s_action_at_pk (cost=0.00..1898.02 rows=82347 width=0) (actual time=0.995..0.995 rows=13402 loops=1) Index Cond: (action_at <= '2018-09-06'::date) Buffers: shared hit=54 Planning Time: 0.719 ms Execution Time: 12.288 ms (17 rows)
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=0.42..5768.94 rows=11004 width=8) (actual time=0.062..14.308 rows=8919 loops=1) Join Filter: (s.m_fk = m.pk) Buffers: shared hit=3130 read=386 -> Seq Scan on m (cost=0.00..1.01 rows=1 width=8) (actual time=0.004..0.006 rows=1 loops=1) Filter: (status = 'A'::text) Buffers: shared hit=1 -> Index Scan using s_pk_action_at on s (cost=0.42..5630.38 rows=11004 width=16) (actual time=0.056..13.012 rows=8919 loops=1) Index Cond: (action_at <= '2018-09-06'::date) Filter: (status = ANY ('{A,P}'::text[])) Rows Removed by Filter: 4483 Buffers: shared hit=3129 read=386 Planning Time: 0.704 ms Execution Time: 14.843 ms (13 rows)
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1260.77..6128.34 rows=11004 width=8) (actual time=1.375..11.957 rows=8919 loops=1) Join Filter: (s.m_fk = m.pk) Buffers: shared hit=3184 -> Seq Scan on m (cost=0.00..1.01 rows=1 width=8) (actual time=0.003..0.004 rows=1 loops=1) Filter: (status = 'A'::text) Buffers: shared hit=1 -> Bitmap Heap Scan on s (cost=1260.77..5989.78 rows=11004 width=16) (actual time=1.369..10.549 rows=8919 loops=1) Recheck Cond: ((action_at <= '2018-09-06'::date) AND (status = ANY ('{P,PD,A}'::text[]))) Filter: (status = ANY ('{A,P}'::text[])) Rows Removed by Filter: 4483 Heap Blocks: exact=3129 Buffers: shared hit=3183 -> Bitmap Index Scan on s_action_at_pk (cost=0.00..1258.02 rows=82347 width=0) (actual time=1.026..1.026 rows=13402 loops=1) Index Cond: (action_at <= '2018-09-06'::date) Buffers: shared hit=54 Planning Time: 0.500 ms Execution Time: 12.437 ms (17 rows)
Attachment:
query.sql
Description: application/sql
Attachment:
setup.sql
Description: application/sql