On 1 October 2014 19:56, Josh Berkus <josh@xxxxxxxxxxxx> wrote:
> On 09/30/2014 04:01 PM, Simon Riggs wrote:
>> On 30 September 2014 18:28, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
>>
>>>> Anyway, in the particular case I posted, fixing n_distinct to
>>>> realistic numbers (%) fixed the query plan.
>>>
>>> But wouldn't fixing the absolute number also have fixed the plan?
>>
>> There are two causes of this issue.
>>
>> 1. Poor estimates of n_distinct. Fixable by user.
>>
>> 2. Poor assumption of homogeneous distribution. No way for the user to
>> fix it, and insufficient stats detail to solve it in the current planner.
>>
>> I see (2) as the main source of issues, since, as we observe, (1) is fixable.
>
> I disagree that (1) is not worth fixing just because we've provided
> users with an API to override the stats. It would unquestionably be
> better for us to have a better n_distinct estimate in the first place.
> Further, this is an easier problem to solve, and fixing n_distinct
> estimates would fix a large minority of currently pathological queries.
> It's like saying "hey, we don't need to fix the leak in your radiator,
> we've given you a funnel in the dashboard you can pour water into."

Having read papers on it, I believe the problem is intractable. Coding
is not the issue. To anyone: please prove me wrong, in detail, with
references, so it can be coded.

> I do agree that (2) is worth fixing *as well*. As a first
> approximation, one possibility (as Tom suggests) would be to come up
> with a mathematical model for a selectivity estimate that sits somewhere
> *between* homogeneous distribution and the worst case. While that
> wouldn't solve a lot of cases, it would be a start towards having a
> better model.

This may have a reasonable solution, but I don't know it. A more
accurate mathematical model still sidesteps the main problem: it is a
guess, not certain knowledge, and the risk will still remain.

>>> I don't think correlation is up to the task as a complete solution,
>>> although it might help a little. There is no way a simple correlation
>>> can encode that John retired 15 years ago and hasn't logged on since,
>>> while Johannes was hired yesterday and never logged on before then.
>>
>> Ah, OK, essentially the same example.
>>
>> Which is why I ruled out correlation-stats-based approaches and
>> suggested a risk-weighted cost approach instead.
>
> By "risk-weighted" you mean just adjusting cost estimates based on what
> the worst-case cost looks like, correct? That seemed to be your
> proposal from an earlier post. If so, we're in violent agreement here.

I proposed a clear path for this earlier in the thread and have received
no comments on it as yet. Please take a look at that.

--
Simon Riggs                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
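
For concreteness, the per-column override that makes (1) "fixable by
user" is the n_distinct attribute option (PostgreSQL 9.0 and later). A
minimal sketch, using a purely hypothetical app_users table and
last_login column: a positive value is taken as an absolute count of
distinct values (Jeff's "absolute number"), while a negative value
between -1 and 0 is taken as a fraction of the table's row count (the
"%" form mentioned above):

    -- force the planner to assume distinct values = 5% of the row count
    ALTER TABLE app_users ALTER COLUMN last_login SET (n_distinct = -0.05);
    ANALYZE app_users;  -- the override only takes effect at the next ANALYZE

The planner then uses the supplied value in place of its own estimate
until the setting is removed again with
ALTER TABLE app_users ALTER COLUMN last_login RESET (n_distinct).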