On Mon, Nov 17, 2014 at 4:27 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Robert Haas <robertmhaas@xxxxxxxxx> writes:
>> On Thu, Nov 13, 2014 at 7:34 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>>> One thing that occurs to me is that if the generic plan estimate comes
>>> out much cheaper than the custom one, maybe we should assume that the
>>> generic's cost estimate is bogus.  Right offhand I can't think of a
>>> reason for a custom plan to look worse than a generic one, unless
>>> there's a statistical quirk like this one.
>
>> That's an interesting idea, but what do we do after deciding that it's
>> bogus?
>
> Keep using custom plans.  It's possible that the estimate that's in
> error is the custom one, but that's not the way to bet IMO, since the
> custom plan estimate is based on better information.
>
>> The generic plan really can't be cheaper than the custom plan, but it
>> could be the same price, or as close as makes no difference.
>
> Right, and what we want to do is use the generic plan as long as it's
> close to the same cost (close enough to not justify replanning effort).
> The trick here is to not be fooled by estimation errors.  Can we assume
> that generic cost < custom cost is always an estimation error?

Maybe.  It seems like kind of a fragile bet to me.  There's going to be
some qual selectivity below which an index scan on a particular table
outperforms a sequential scan, but the selectivity estimated for a
generic plan can be either higher or lower than the selectivity we'd
estimate for some particular value.  And once one of the two plans
decides on an index scan while the other decides on a sequential scan,
that difference can cascade through and change the whole plan, e.g.
because it affects whether the tuples emerge with usable pathkeys.  I
don't feel very confident that applying < to the result of all that
will tell us anything useful.

I think what's really going on here is that the generic plan is optimal
for some range of possible qual selectivities.  Typically, the qual is
of the form col = val, so the plan is optimal for all values whose
estimated frequency falls between some values A and B.  What would be
nice is to notice when we see a value outside that range, and switch to
a custom plan in that case.  I realize that the planner isn't really
well set up to deliver the information we'd need to test that for each
newly supplied value, but that's really what we need to do.

The current system wastes CPU cycles by replanning up to 5 times even
when there is no benefit to be gained by it, but it can also cause big
performance problems when it settles into a generic plan and a value
with different characteristics shows up later on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
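
P.S. A concrete, entirely made-up illustration of that last failure
mode, for anyone who wants to reproduce it: the table, data, and
statement names below are hypothetical, but the five-custom-plans-
then-maybe-generic behavior is what the plan cache actually does today.

    CREATE TABLE t (col int);
    -- One value accounts for nearly every row; the rest are rare.
    INSERT INTO t SELECT 1 FROM generate_series(1, 1000000);
    INSERT INTO t SELECT g FROM generate_series(2, 100) g;
    CREATE INDEX ON t (col);
    ANALYZE t;

    PREPARE q(int) AS SELECT count(*) FROM t WHERE col = $1;

    -- The first five executions are planned with the actual parameter
    -- value, so a rare value like 42 gets an index scan each time.
    EXECUTE q(42);
    EXECUTE q(42);
    EXECUTE q(42);
    EXECUTE q(42);
    EXECUTE q(42);

    -- From the sixth execution on, the plan cache compares the generic
    -- plan's estimated cost against the average custom-plan cost, and
    -- if they're close it settles on the generic plan for good.
    EXECUTE q(42);

    -- If the cached generic plan is an index scan, the common value
    -- now runs with a plan that's badly wrong for it; a custom plan
    -- would have chosen a sequential scan here.
    EXECUTE q(1);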
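
The per-value frequency estimates such a range test would consult do
already exist in the statistics, at least for the common values; e.g.
(pg_stats and its columns are real, t is the made-up table above):

    SELECT most_common_vals, most_common_freqs, n_distinct
    FROM pg_stats
    WHERE tablename = 't' AND attname = 'col';

Comparing a supplied value's estimated frequency against the A-B band
for which the cached plan was built is essentially the check proposed
above; the hard part, as I said, is getting the planner to report that
band in the first place.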