On 02/12/2017 01:11, Justin Pryzby wrote:
I tried to reproduce this issue and couldn't, under PG95 and 10.1:
On Fri, Dec 01, 2017 at 12:34:27PM -0600, Justin Pryzby wrote:
On Fri, Dec 01, 2017 at 07:40:08PM +0200, Vitaliy Garnashevich wrote:
We recently had an issue in production, where a bitmap scan was chosen
instead of an index scan. Despite being 30x slower, the bitmap scan had
about the same cost as the index scan.
drop table if exists aaa;
create table aaa as select (id%100)::int num, (id%10=1)::bool flag from
generate_series(1, 10000000) id;
create index i1 on aaa (num);
create index i2 on aaa (flag);
analyze aaa;
What is:
effective_io_concurrency
max_parallel_workers_per_gather (I gather you don't have this)
effective_io_concurrency = 0
max_parallel_workers_per_gather = 0
Did you notice random_page_cost = 1.5 ?
For this test I'm using SSD and Windows (if that matters). On production
we also use SSD, hence lower random_page_cost. But with the default
random_page_cost=4.0, the difference in cost between the index scan plan
and the bitmap scan plan is even bigger.
Note:
postgres=# SELECT correlation FROM pg_stats WHERE tablename='aaa' AND attname='num';
correlation | 0.00710112
..so this is different from the issue corrected by the patch I created while
testing.
Note that id%100==1 implies flag='t', so the planner anticipates retrieving
fewer rows than it will ultimately read, probably by 2x. It makes sense that
causes the index scan to be more expensive than expected, but that's only
somewhat important, since there's no joins involved.
I changed the query from COUNT(*) TO * for easier to read explain:
CREATE TABLE aaa AS SELECT (id%100)::int num, (id%10=1)::bool flag FROM generate_series(1, 10000000) id;
CREATE INDEX i1 ON aaa(num);
CREATE INDEX i2 ON aaa (flag);
ANALYZE VERBOSE aaa;
EXPLAIN (analyze,verbose,costs,buffers) SELECT * FROM aaa WHERE num=1 AND flag=true;
Bitmap Heap Scan on public.aaa (cost=20652.98..45751.75 rows=10754 width=5) (actual time=85.314..185.107 rows=100000 loops=1)
-> BitmapAnd (cost=20652.98..20652.98 rows=10754 width=0) (actual time=163.220..163.220 rows=0 loops=1)
-> Bitmap Index Scan on i1 (cost=0.00..1965.93 rows=106333 width=0) (actual time=26.943..26.943 rows=100000 loops=1)
-> Bitmap Index Scan on i2 (cost=0.00..18681.42 rows=1011332 width=0) (actual time=133.804..133.804 rows=1000000 loops=1)
..which is what's wanted with no planner hints (PG10.1 here).
Yes, that's what you get without planner hints, but it's strange to get
this plan, when there is another one, which runs 2-3 times faster, but
happens to be estimated to be twice more costly than the one with bitmap
scans:
# set enable_bitmapscan = off; set enable_indexscan = on; set
enable_seqscan = off;
# explain analyze select * from aaa where num = 1 and flag = true;
Index Scan using i1 on aaa (cost=0.44..66369.81 rows=10428 width=5)
(actual time=0.020..57.765 rows=100000 loops=1)
vs.
# set enable_bitmapscan = on; set enable_indexscan = off; set
enable_seqscan = off;
# explain analyze select * from aaa where num = 1 and flag = true;
Bitmap Heap Scan on aaa (cost=13099.33..25081.40 rows=10428 width=5)
(actual time=122.137..182.811 rows=100000 loops=1)
-> BitmapAnd (cost=13099.33..13099.33 rows=10428 width=0) (actual
time=110.168..110.168 rows=0 loops=1)
-> Bitmap Index Scan on i1 (cost=0.00..1181.44 rows=101667
width=0) (actual time=20.845..20.845 rows=100000 loops=1)
-> Bitmap Index Scan on i2 (cost=0.00..11912.43 rows=1025666
width=0) (actual time=80.323..80.323 rows=1000000 loops=1)
Same on PG95:
postgres=# EXPLAIN (analyze,verbose,costs,buffers) SELECT * FROM aaa WHERE num=1 AND flag=true;
Bitmap Heap Scan on public.aaa (cost=19755.64..43640.32 rows=9979 width=5) (actual time=230.017..336.583 rows=100000 loops=1)
-> BitmapAnd (cost=19755.64..19755.64 rows=9979 width=0) (actual time=205.242..205.242 rows=0 loops=1)
-> Bitmap Index Scan on i1 (cost=0.00..1911.44 rows=103334 width=0) (actual time=24.911..24.911 rows=100000 loops=1)
-> Bitmap Index Scan on i2 (cost=0.00..17838.96 rows=965670 width=0) (actual time=154.237..154.237 rows=1000000 loops=1)
The rowcount is off, but not a critical issue without a join.
Justin