Craig A. James wrote:
My example, discussed previously in this forum, is a classic. I have a
VERY expensive function (it's in the class of NP-complete problems, so
there is no faster way to do it). There is no circumstance when my
function should be used as a filter, and no circumstance when it should
be done before a join. But PG has no way of knowing the cost of a
function, and so the optimizer assigns the same cost to every function.
Big disaster.
The result? I can't use my function in any WHERE clause that involves
any other conditions or joins. Only by itself. PG will occasionally
decide to use my function as a filter instead of doing the join or the
other WHERE conditions first, and I'm dead.
this is an argument for cost-for-functions rather than hints AFAICS.
It seems to me that if (in addition to the function cost) we come up
with some efficient way of recording cross column statistics we would be
well on the way to silencing *most* of the demands for hints.
We would still be left with some of the really difficult problems - a
metric for "locally correlated" column distributions and a reliable
statistical algorithm for most common value sampling (or a different way
of approaching this). These sound like interesting computer science or
mathematics thesis topics, maybe we could try (again?) to get some
interest at that level?
Cheers
Mark