On 03/05/13 00:27, Simon Riggs wrote:
Would be practicable to have a facility for telling Postgres in advance what you intend to do, so it can create plans accordingly?On 2 May 2013 01:49, Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> wrote:On 02/05/13 02:06, Tom Lane wrote:Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> writes:I am concerned that the deafening lack of any replies to my original message is a result of folk glancing at your original quick reply and thinking... incomplete problem spec...ignore... when that is not that case - yes I should have muttered "9.2" in the original email, but we have covered that now.No, I think it's more that we're trying to get to beta, and so anything that looks like new development is getting shuffled to folks' "to look at later" queues. The proposed patch is IMO a complete nonstarter anyway; but I'm not sure what a less bogus solution would look like.Yeah, I did think that beta might be consuming everyone's attention (of course immediately *after* sending the email)! And yes, the patch was merely to illustrate the problem rather than any serious attempt at a solution.I think we need a problem statement before we attempt a solution, which is what Tom is alluding to. ISTM that you've got a case where the plan is very sensitive to a table load. Which is a pretty common situation and one that can be solved in various ways. I don't see much that Postgres can do because it can't know ahead of time you're about to load rows. We could imagine an optimizer that set thresholds on plans that caused the whole plan to be recalculated half way thru a run, but that would be a lot of work to design and implement and even harder to test. Having static plans at least allows us to discuss what it does after the fact with some ease. The plan is set using stats that are set when there are very few non-NULL rows, and those increase massively on load. The way to cope is to run the ANALYZE immediately after the load and then don't allow auto-ANALYZE to reset them later. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services I won't try and invent syntax, but it would be good to tell the system that you intend to: insert a million rows sequentially or insert ten million rows randomly or update two million rows in the primary key range 'AA00000' to 'PP88877' etc. Though, sometime it may be useful to give a little more detail,especially if you have a good estimate of the distribution of primary keys, e.g.: AA000000 20% AA000456 0% KE700999 30% NN400005 35% PA000001 15% PP808877 I figure that if the planner had more information about what one intends to do, then it could combine that with the statistics it knows, to come up with a more realistic plan. Cheers, Gavin |