> > My interpretation reading the query plan is: well-sized small
> > batches of upserts leverage the indexes, while the regular join
> > chooses the sequential scan, including sorting and hashing, which
> > takes a lot of time and resources, including disk.

> You may be right, but it's hard to tell without seeing the query
> plan.

Here are parts of both plans:

Bad case (strategy 2.1):

->  Merge Left Join  (cost=530202629.03..255884257913.32 rows=17023331531230 width=579)
      Merge Cond: (david.list_id = ca.list_id)
      ->  Sort  (cost=2019172.91..2024398.82 rows=2090361 width=569)
            Sort Key: david.list_id
            ->  Append  (cost=0.00..192152.41 rows=2090361 width=569)
                  ->  Seq Scan on david_0 david_1  (cost=0.00..1812.52 rows=20852 width=569)
                  ->  Seq Scan on david_1 david_2  (cost=0.00..1800.09 rows=20709 width=569)
                  ->  Seq Scan on david_2 david_3  (cost=0.00..1794.44 rows=20644 width=569)

Good case (strategy 3):

Merge on goliath_23 ca  (cost=2139.75..11077.17 rows=0 width=0)
  ->  Nested Loop Left Join  (cost=2139.75..11077.17 rows=1000 width=575)
        ->  Limit  (cost=2139.19..2495.67 rows=1000 width=569)
              ->  Index Scan using david_23_list_id_account_id_idx on david_23  (cost=0.29..6794.16 rows=19058 width=569)
        ->  Index Scan using goliath_23_list_id_account_id_idx on goliath_23 ca  (cost=0.56..8.56 rows=1 width=14)
              Index Cond: (list_id = david_23.list_id)

> Sounds very much like you'd benefit from tuning some cost parameters
> to make the index scan look cheaper.

> Not sure what 'batched indexed join' would be, but it very much
> sounds like a nested loop with an index scan.

Agreed, a 2M-row nested loop over an index scan would likely work as
well. Could tuning the cost parameters lead the planner to choose such
a large nested loop?

> What PostgreSQL version are you using, what hardware? Did you tune
> it in any way, or is everything just default?

It is PostgreSQL 15.3, on 2 cores / 8 GB RAM / 2 TB SSD, with the cloud
provider's default parameters (RDS).
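
For reference, the cost-parameter tuning suggested above can be tried
per-session before touching the instance configuration. A minimal
sketch, assuming SSD-backed storage; the specific values are
illustrative starting points, not recommendations, and the query is a
placeholder for the actual upsert:

```sql
-- Session-only experiment: these settings revert when the session ends.
-- random_page_cost defaults to 4.0, which models spinning disks; values
-- near 1.1 are commonly used for SSDs and make index scans look cheaper
-- relative to sequential scans.
SET random_page_cost = 1.1;

-- effective_cache_size does not allocate memory; it only tells the
-- planner how much of the data is likely cached. Assumption: roughly
-- 75% of the 8 GB instance.
SET effective_cache_size = '6GB';

-- Re-run the problematic statement and compare the chosen plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT 1;  -- replace with the actual merge/upsert query
```

If the planner then picks the nested loop with index scans, the same
settings can be applied persistently via the RDS parameter group.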