Branden Visser <mrvisser@xxxxxxxxx> writes: > I just wanted to update that I've found evidence that fixing the > planner row estimation may not actually influence it to use the more > performant merge join instead of hash join. I have found instances > where the row estimation is *overestimated* by a magnitude of 4x > (estimates 2.4m rows) and still chooses hash join over merge join, > where merge join is much faster (45s v.s. 12s). I wonder why the merge join is faster exactly. It doesn't usually have a huge benefit unless the inputs are presorted already. The one case I can think of where it can win quite a lot is if the range of merge keys in one input is such that we can skip reading most of the other input. (Extreme example: one input has keys 1..10, but the other input has keys 1..10000. We only need to read the first 1% of the second input, assuming there's an index on its key column so that we don't have to read the whole thing anyway to sort it.) The planner is aware of that effect, but I wonder if it's misestimating it for some reason. Anyway it would be worth looking closely at your EXPLAIN ANALYZE results to determine whether early-stop is happening or not. It'd manifest as one join input node showing an actual number of rows returned that's less than you'd expect. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general