Re: Yet another abort-early plan disaster on 9.3

On Tue, Sep 30, 2014 at 11:54 AM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Mon, Sep 29, 2014 at 7:12 PM, Gavin Flower
> <GavinFlower@xxxxxxxxxxxxxxxxx> wrote:
>>
>>
>> Would it be feasible to get a competent statistician to advise what data
>> to collect, and to analyze it?  Maybe it is possible to get a better
>> estimate on how much of a table needs to be scanned, based on some fairly
>> simple statistics.  But unless research is done, it is probably impossible
>> to determine what statistics might be useful, and how effective a better
>> estimate could be.
>>
>> I have a nasty feeling that assuming a uniform distribution may still end
>> up being the best we can do - but I may be being unduly pessimistic!
>
> As a semi-competent statistician, my gut feeling is that our best bet would
> be not to rely on the competence of statisticians for too much, and instead
> try to give the executor the ability to abandon a fruitless path and pick a
> different plan instead. Of course this option is foreclosed once a tuple is
> returned to the client (unless the ctid is also cached, so we can make sure
> not to send it again on the new plan).
>
> I think that the exponential explosion of possibilities is going to be too
> great to analyze in any rigorous way.

Call it the 'Parking in Manhattan' strategy -- you know it's time
to pull forward when you've smacked into the car behind you.

Kidding aside, this might be the path forward, since (A) it's more
general and can catch all kinds of problem cases that our statistics
system won't/can't catch, and (B) at least in my case, it seems like more
complicated plans tend to not return much data until the innermost
risky parts have been involved.  Even if that weren't the case,
withholding data from the client until a user-configurable time
threshold had passed (giving the planner time to back up if
necessary) would be a reasonable user-facing tradeoff via a GUC:
'max_planner_retry_time'.
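
To make the hold-back idea concrete, here is a rough Python sketch of the
control flow I have in mind.  Nothing here is real executor code: the
function run_with_retry, the PlanAbandoned exception, and the injectable
clock are all made up for illustration, and 'max_planner_retry_time' is
only the proposed GUC, not an existing setting.  Rows are buffered until
the threshold passes; before that point a plan can be dropped and the
next one tried, and after it the option is foreclosed.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (row id, payload); purely illustrative


class PlanAbandoned(Exception):
    """Hypothetical signal: the running plan decided it is on a fruitless path."""


def run_with_retry(
    plans: List[Callable[[], Iterable[Row]]],
    max_planner_retry_time: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Row]:
    """Yield rows from the first plan that is not abandoned.

    Rows are withheld from the caller until max_planner_retry_time seconds
    have elapsed since the query started.  While rows are withheld, a plan
    that raises PlanAbandoned is discarded along with its buffered rows.
    """
    start = clock()
    for i, plan in enumerate(plans):
        held: List[Row] = []      # rows produced but not yet sent
        committed = False         # True once any row has reached the client
        is_last = i == len(plans) - 1
        try:
            for row in plan():
                if committed:
                    yield row
                    continue
                held.append(row)
                # No fallback left, or the window has closed: start sending.
                if is_last or clock() - start >= max_planner_retry_time:
                    committed = True
                    yield from held
                    held = []
        except PlanAbandoned:
            if committed:
                raise             # too late, the client already has rows
            continue              # drop buffered rows, try the next plan
        if not committed:
            yield from held       # plan finished inside the window
        return
    raise RuntimeError("every candidate plan was abandoned")
```

With a generous threshold, a first plan that produces one row and then
gives up is silently replaced by the second plan, and the client never
sees a duplicate.  With a threshold of zero the first row goes out
immediately, so the same abandonment surfaces as an error instead.  The
ctid-caching variant Jeff mentions would relax that last restriction,
and is not covered by this sketch.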

merlin

