Tom Lane a écrit :
Benoit Delbosc <bdelbosc@xxxxxxxxx> writes:
I am trying to understand why inside an EXISTS clause the query planner
does not use the index:
I'm not sure this plan is as bad as all that. The key point is that the
planner is expecting 52517 rows that match that users_md5 value (and the
true number is evidently 51446, so that estimate isn't far off). That's
about 1/48th of the table. It knows that the EXISTS case can stop as
soon as it finds one match, so it's betting that a plain seqscan will
hit a match faster than an index lookup would be able to, ie,
seqscanning about 48 tuples is faster than one index lookup. This might
be a bad bet if the users_md5 values are correlated with physical order,
ie the matches are not randomly scattered but are all towards the end of
the table.
exact, the data is not randomly scattered but ordered this explains why
in my case seq scan is a bad bet
Barring that, though, it could be a good bet if the table
isn't swapped in. Which is what the default cost parameters are set
up to assume.
there are lots of shared buffers and effective memory on this instance,
the query is executed many times I can assume that the table isn't
swapped in right ?
I suspect your real complaint is that you expect the table to be swapped
in, in which case what you ought to be doing is adjusting the planner's
cost parameters. Some playing around here with a similar case suggests
that even a small reduction in random_page_cost would make it prefer an
indexscan for this type of situation.
excellent !
Changing the random_page_cost from 4 to 2 do the trick
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache
WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=1.06..1.07 rows=1 width=0) (actual time=0.048..0.048
rows=1 loops=1)
One-Time Filter: $0
InitPlan 1 (returns $0)
-> Index Scan using read_acls_cache_users_md5_idx on
read_acls_cache (cost=0.00..55664.21 rows=52517 width=0) (actual
time=0.045..0.045 rows=1 loops=1)
Index Cond: ((users_md5)::text =
'9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text)
Total runtime: 0.087 ms
(6 rows)
466/0.087 = 5360 thanks !
kind regards
ben
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance