Tom Lane wrote:
> Jack Orenstein <jack.orenstein@xxxxxxx> writes:
>> Limit (cost=0.00..324.99 rows=100 width=451)
>>   -> Index Scan using t_pkey on t (cost=0.00..296027.98 rows=91088 width=451)
>>        Index Cond: (pk > 1000000000)
>> Adding the value restriction at the top of this query plan wouldn't
>> increase the cost very much.
>
> You're missing the point: with the value restriction in place, it's
> estimating that it will probably have to scan all 91000 rows (because
> there are fewer than 100 satisfying the value restriction). And that
> is going to cost somewhere north of 296027 cost units --- the cost
> shown, plus 91000 invocations of the value-restriction check.
> Which is considerably more than the cost of the other plan.
I see -- the optimizer is calculating that it will have to examine a very large
fraction of the rows.
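To spell that arithmetic out as a rough sanity check, assuming the default
cpu_operator_cost of 0.0025 and one operator evaluation per scanned row (the
real per-row charge depends on the expression, hence "north of" the scan
cost):

-- Back-of-the-envelope version of Tom's estimate: full index scan cost
-- plus one value-restriction evaluation per scanned row.
SELECT 296027.98 + 91088 * 0.0025 AS approx_total_cost;
-- approx_total_cost = 296255.70, with no LIMIT early-out to scale it down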
Actually, pk and value are highly correlated. If a row gets past the index scan,
odds are very high that the value predicate will evaluate to true. So I am sure
that the index scan is the right way to go. I'm just trying to convince the
optimizer of this.
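One blunt, well-known knob would be to discourage the competing plan directly
instead of inflating function costs. That only helps if the plan being chosen
is a sequential scan, which I haven't confirmed; a sketch, with SET LOCAL
keeping the change confined to a single transaction:

BEGIN;
-- Assumes the planner's preferred plan is a seqscan; enable_seqscan = off
-- adds a large cost penalty to seqscans rather than forbidding them.
SET LOCAL enable_seqscan = off;
SELECT *
FROM t
WHERE pk > 1000000
  AND value = 'asdf'::bytea
ORDER BY pk
LIMIT 100;
COMMIT;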
One thing I am considering is introducing a function with high execution cost.
E.g., if I do this:
create function return_input(bytea) returns bytea
as '
begin
    -- identity; exists only so the planner charges cost 10000 per call
    return $1;
end;
' language 'plpgsql' cost 10000;
explain
select *
from t
where pk > 1000000
and return_input(value) = 'asdf'::bytea
order by pk
limit 100;
then I get the plan I want.
Limit (cost=0.00..563490.32 rows=100 width=451)
  -> Index Scan using t_pkey on t (cost=0.00..34226402.07 rows=6074 width=451)
       Index Cond: (pk > 1000000)
       Filter: (return_input(value) = 'asdf'::bytea)
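I believe the trick works because the planner now charges roughly 10000 units
per filter evaluation, and the pipelined index scan is billed only for the
fraction of rows the LIMIT is expected to consume (about 100/6074 here, which
is how 34226402 becomes 563490), while any plan that must filter every row
before limiting pays the full cost. Incidentally, dollar quoting makes the
same function a little easier to read; a sketch, assuming PostgreSQL 8.3 or
later for the COST attribute:

-- Same expensive identity function, written with dollar quoting so the
-- body needs no escaped quotes; COST requires 8.3+.
CREATE FUNCTION return_input(bytea) RETURNS bytea AS $$
BEGIN
    RETURN $1;
END;
$$ LANGUAGE plpgsql COST 10000;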
Is there a more elegant way of forcing the plan that I want?
Jack