Kevin Grittner <kgrittn@xxxxxxxxx> writes: > Michael Kolomeitsev <mkolomeitsev@xxxxxxxxx> wrote: >> it is clear for me why t1_b_a_idx is better. The question is: Is >> postgresql able to see that? > For a number of reasons I never consider a bulk load complete until > I run VACUUM FREEZE ANALYZE on the table(s) involved. When I try > your test case without that, I get the bad index choice. When I > then run VACUUM FREEZE ANALYZE on the database I get the good index > choice. I think that's just chance, because AFAICS the cost estimates are exactly the same for both indexes, once you've done the vacuum to make all the heap pages all-visible. What's more, I'm not sure that that's wrong, because according to EXPLAIN (ANALYZE, BUFFERS) the exact same number of index pages are touched for either index. So I think Michael's claim that the one index is better is at best unproven. regression=# explain (analyze, buffers) select count(*) from t1 where a in (1, 9, 17, 26, 35, 41, 50) and b = 333333; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=32.12..32.13 rows=1 width=0) (actual time=0.097..0.098 rows=1 loops=1) Buffers: shared hit=30 -> Index Only Scan using t1_b_a_idx on t1 (cost=0.57..32.10 rows=7 width=0) (actual time=0.044..0.085 rows=7 loops=1) Index Cond: ((b = 333333) AND (a = ANY ('{1,9,17,26,35,41,50}'::integer[]))) Heap Fetches: 0 Buffers: shared hit=30 Total runtime: 0.174 ms (7 rows) regression=# begin; drop index t1_b_a_idx; BEGIN DROP INDEX regression=# explain (analyze, buffers) select count(*) from t1 where a in (1, 9, 17, 26, 35, 41, 50) and b = 333333; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=32.12..32.13 rows=1 width=0) (actual time=0.110..0.110 rows=1 loops=1) Buffers: shared hit=30 -> Index Only Scan using t1_a_b_idx on t1 (cost=0.57..32.10 rows=7 width=0) (actual time=0.039..0.101 rows=7 loops=1) Index Cond: ((a = ANY ('{1,9,17,26,35,41,50}'::integer[])) AND (b = 333333)) Heap Fetches: 0 Buffers: shared hit=30 Total runtime: 0.199 ms (7 rows) regression=# abort; ROLLBACK I grant the theory that the repeated index probes in t1_b_a_idx should be more localized than those in t1_a_b_idx, but PG's optimizer doesn't attempt to estimate such effects, and this example isn't doing much to convince me that it'd be worth the trouble. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance