Hi,
I have a table tpoint near that 820Mo with 700K lignes
I have created 2 index on this table :
CREATE INDEX idx_small_index ON public.tpoint USING btree (match_id);
-- Size : 4560 kB
CREATE INDEX idx_big_index ON public.tpoint USING btree (version, match_id, playnum, code_action, num_balle );
-- Size : 34 MB
If I execute this request
EXPLAIN(analyse, buffers)
SELECT count(*) FROM tpoint
WHERE match_id = 'SM001'
The query planner use the idx_small_index as expected
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..507.10 rows=624 width=0) (actual time=0.025..0.160 rows=1017 loops=1)
Index Cond: (match_id = 'SM001'::bpchar)
Heap Fetches: 199
Buffers: shared hit=45
But if I execute this other request
EXPLAIN(analyse, buffers)
SELECT count(*) FROM tpoint
WHERE match_id LIKE 'SM001%'
The query planner use idx_big_index
-> Index Only Scan using idx_big_index on tpoint (cost=0.42..73940.37 rows=5191 width=0) (actual time=111.014..143.379 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=473963
I really don't understand why the planner prefers to use the "big index" instead of the "small" one that containt all needed element to do an INDEX ONLY SCAN.
If I deactivate the "big index"
UPDATE pg_index SET indisvalid = false WHERE indexrelid = 'idx_big_index'::regclass
I can see that for a first explain :
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..92107.62 rows=5191 width=0) (actual time=59.980..78.683 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=27668 read=564
or that for a second explain (No read):
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..92107.62 rows=5191 width=0) (actual time=64.012..81.122 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=28232
Tests realized on a docker postgres
version = PostgreSQL 15.0 (Debian 15.0-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
SET max_parallel_workers_per_gather TO 0
Thanks for any explanation
regards
I have a table tpoint near that 820Mo with 700K lignes
I have created 2 index on this table :
CREATE INDEX idx_small_index ON public.tpoint USING btree (match_id);
-- Size : 4560 kB
CREATE INDEX idx_big_index ON public.tpoint USING btree (version, match_id, playnum, code_action, num_balle );
-- Size : 34 MB
If I execute this request
EXPLAIN(analyse, buffers)
SELECT count(*) FROM tpoint
WHERE match_id = 'SM001'
The query planner use the idx_small_index as expected
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..507.10 rows=624 width=0) (actual time=0.025..0.160 rows=1017 loops=1)
Index Cond: (match_id = 'SM001'::bpchar)
Heap Fetches: 199
Buffers: shared hit=45
But if I execute this other request
EXPLAIN(analyse, buffers)
SELECT count(*) FROM tpoint
WHERE match_id LIKE 'SM001%'
The query planner use idx_big_index
-> Index Only Scan using idx_big_index on tpoint (cost=0.42..73940.37 rows=5191 width=0) (actual time=111.014..143.379 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=473963
I really don't understand why the planner prefers to use the "big index" instead of the "small" one that containt all needed element to do an INDEX ONLY SCAN.
If I deactivate the "big index"
UPDATE pg_index SET indisvalid = false WHERE indexrelid = 'idx_big_index'::regclass
I can see that for a first explain :
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..92107.62 rows=5191 width=0) (actual time=59.980..78.683 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=27668 read=564
or that for a second explain (No read):
-> Index Only Scan using idx_small_index on tpoint (cost=0.42..92107.62 rows=5191 width=0) (actual time=64.012..81.122 rows=1017 loops=1)
Filter: (match_id ~~ 'SM001%'::text)
Rows Removed by Filter: 636819
Heap Fetches: 132426
Buffers: shared hit=28232
Tests realized on a docker postgres
version = PostgreSQL 15.0 (Debian 15.0-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
SET max_parallel_workers_per_gather TO 0
Thanks for any explanation
regards