Re: Understanding bad estimate (related to FKs?)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 29, 2020 at 11:25:48AM -0400, Philip Semanchuk wrote:


On Oct 28, 2020, at 9:13 PM, Justin Pryzby <pryzby@xxxxxxxxxxxxx>
wrote:

On Mon, Oct 26, 2020 at 11:20:01AM -0600, Michael Lewis wrote:
On Mon, Oct 26, 2020 at 11:14 AM Philip Semanchuk
<philip@xxxxxxxxxxxxxxxxxxxxx> wrote:

The item I'm focused on is node 23. The estimate is for 7 rows,
actual
is 896 (multiplied by 1062 loops). I'm confused about two things in
this node.

The first is Postgres' estimate. The condition for this index
scan
contains three expressions --

(five_uniform = zulu_five.five_uniform) AND (whiskey_mike =
juliet_india.whiskey_mike) AND (bravo = 'mike'::text)


Are the columns correlated?

I guess it shouldn't matter, since the FKs should remove all but one
of the conditions.

Yes, I had the same expectation. I thought Postgres would calculate the
selectivity as 1.0 * 1.0 * whatever estimate it has for the frequency
of ‘mike’, but since the frequency estimate is very accurate but the
planner’s estimate is not, there’s something else going on.


Well, this is quite a bit more complicated, I'm afraid :-( The clauses
include parameters passed from the nodes above the index scan. So even
if we had extended stats on the table, we couldn't use them as that
requires (Var op Const) conditions. So this likely ends up with a
product of estimates for each clause, and even then we can't use any
particular value so we probably end up with something like 1/ndistinct
or something like that. So if the values actually passed to the index
scan are more common and/or if the columns are somehow correlated, it's
not surprising we end up with an overestimate.

Maybe you saw this other thread, which I tentatively think also
affects your case (equijoin with nonjoin condition)
https://www.postgresql.org/message-id/AM6PR02MB5287A0ADD936C1FA80973E72AB190%40AM6PR02MB5287.eurprd02.prod.outlook.com

Yes, thank you, I read that thread with interest. I tried your clever
trick using BETWEEN, but it didn’t change the plan. Does that suggest
there’s some other cause for the planner’s poor estimate?


I don't think that's related - to hit that bug, there would have to be
implied conditions pushed-down to the scan level. And there's nothing
like that in this case.

FWIW I don't think this has anything to do with join cardinality
estimation - at least not for the node 23.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux