On 2/2/22 22:10, Artyom Shaposhnikov wrote:
> the row estimate became ~1000x smaller with the stat in place, so it
> looks like it grossly miscalculates the query plans without the stats
> for large tables representing M:M relations.
Well, if the estimates are significantly off (and 3 orders of magnitude
certainly qualifies), then all bets are off. There's no magical option
that'd fix planning in such conditions.
Ultimately, fixing the estimates (e.g. by creating extended statistics)
is the right "fix" as it gives the optimizer the information needed to
pick the right plan.
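For the record, a minimal sketch of what that could look like, assuming
the M:M table is called data_user with join columns data_id and user_id
(those names are placeholders, not taken from the actual schema):

   -- multi-column statistics on the correlated columns of the join table
   CREATE STATISTICS data_user_stx (ndistinct, dependencies, mcv)
     ON data_id, user_id FROM data_user;

   -- refresh the stats so the planner can actually use them
   ANALYZE data_user;

Once ANALYZE has run, the planner has multi-column ndistinct/MCV data to
work with instead of assuming the columns are independent.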
> On Wed, Feb 2, 2022 at 11:47 AM Michael Lewis <mlewis@xxxxxxxxxxx> wrote:
>> What does the row estimate look like on the scan of data table with
>> that statistic in place? Anytime the stats give a mis-estimate this
>> far off, I wouldn't expect that plans would be optimal except by luck.
>
> Index Scan using data_pkey on data t (cost=0.57..21427806.53 rows=58785023 width=131) (actual time=0.024..0.482 rows=854 loops=1)
>   Index Cond: (id > 205284974)
It'd be interesting to see the plans without the LIMIT, as that makes
the "actual" values low simply by terminating early.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company