On 2/15/22 01:06, A Shaposhnikov wrote:
Interestingly I have a second PG 14.2 database, with identical table
definitions, but about 10% smaller row counts, and the exact same query
works fast there without the 2nd condition:
Are you sure about the 10%? Because in the plans from the first machine
I see this:
>
> -> Index Scan using team_pkey on team t (cost=0.57..11382381.88
> rows=78693167 width=175) (actual time=0.016..0.695 rows=854 loops=1)
>
while the second machine does this:
-> Index Scan using team_pkey on team t (cost=0.57..2366113.83
rows=2807531 width=160) (actual time=0.031..0.801 rows=888 loops=1)
That's 2.8M vs. 78M, quite far from "10% difference". Not sure about
team_aliases table, that's imposible to say from the plans.
This may matter a lot, because we use effective cache size to calculate
cache hit ratio for the query, with relation sizes as an input. So
smaller relations (or larger effective_cache_size) means cheaper random
I/O, hence preference for nested loop join.
The other thing is data distribution - that may matter too.
IMO it's pointless to investigate this further - we know what's causing
the issue. The optimizer is oblivious that merge join will have to skip
large part of the second input, due to the implicit condition. Notice
that adding the condition changes the cost from:
Limit (cost=81.33..331.82 rows=1000 width=183) ...
to
Limit (cost=81.33..720.48 rows=1000 width=183) ...
So it seems *more* expensive than the first plan. Taken to the extreme
the planner could theoretically have chosen to use the first plan (and
delay the condition until after the join).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company