Re: Help with row estimate problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/7/2024 22:51, Jon Zeppieri wrote:
Of course, I'd prefer not to have to materialize this relation
explicitly. This particular query, for this particular user, benefits
from it, but similar queries or queries for different users may not.

I think the root of the problem is that population size (i.e., the
number of members in a given population) has a high variance, and then
planner is basing its estimates on the average population size (and
maybe the average number of populations to which a user has access?),
which is not especially useful. Is there anything I can do about this?
Would any extended statistics be useful here?
Thanks for report. I see such cases frequently enough and the key problem here is data skew, as you already mentioned. Extended statistics doesn't help here. Also, because we can't estimate specific values coming from the outer NestLoop - we can't involve MCV to estimate selectivity of the population. That's the reason why the optimiser uses ndistinct value. What you can do? I see only one option - split the table to some partitions where data will be distributed more or less uniformly. And invent a criteria for pruning unnecessary partitions. Of course, you can also try pg_hint_plan and force planner to use MergeJoin or HashJoin in that suspicious case.

--
regards, Andrei Lepikhov






[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux