On 09/30/2014 04:01 PM, Simon Riggs wrote: > On 30 September 2014 18:28, Jeff Janes <jeff.janes@xxxxxxxxx> wrote: > >>> Anyway, in the particular case I posted fixing n_distinct to realistic >>> numbers (%) fixed the query plan. >> >> >> But wouldn't fixing the absolute number also have fixed the plan? > > There are two causes of this issue. > > 1. Poor estimates of n_distinct. Fixable by user. > > 2. Poor assumption of homogeneous distribution. No way for user to > fix. Insufficient stats detail to be able to solve in current planner. > > I see (2) as the main source of issues, since as we observe, (1) is fixable. I disagree that (1) is not worth fixing just because we've provided users with an API to override the stats. It would unquestionably be better for us to have a better n_distinct estimate in the first place. Further, this is an easier problem to solve, and fixing n_distinct estimates would fix a large minority of currently pathological queries. It's like saying "hey, we don't need to fix the leak in your radiator, we've given you a funnel in the dashboard you can pour water into." I do agree that (2) is worth fixing *as well*. In a first approximation, one possibility (as Tom suggests) would be to come up with a mathematical model for a selectivity estimate which was somewhere *between* homogenous distribution and the worst case. While that wouldn't solve a lot of cases, it would be a start towards having a better model. >> I don't think correlation is up to the task as a complete solution, although >> it might help a little. There is no way a simple correlation can encode >> that John retired 15 years ago and hasn't logged on since, while Johannes >> was hired yesterday and never logged on before then. > > Ah, OK, essentially the same example. > > Which is why I ruled out correlation stats based approaches and > suggested a risk-weighted cost approach. By "risk-weighted" you mean just adjusting cost estimates based on what the worst case cost looks like, correct? That seemed to be your proposal from an earlier post. If so, we're in violent agreement here. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance