Ahh, thanks Tom. I hadn't seen your email before I posted my own followup. I guess the clustering approach managed to work around the need to mess with the statistics target. I did previously increase the target to 1000 (from 100) for that field and had no impact, but this is an aspect of tuning I'm not so familiar with - I didn't consider pushing it all the way to 11. On 11 Aug 2013, at 00:28, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > Timothy Kane <tim.kane@xxxxxxxxx> writes: >> I seem to be having some grief with the 9.1.9 query planner favouring an index scan + merge join, over a sequential scan + hash join. > > I believe the reason it's preferring the merge join plan is that it thinks > the executor will be able to terminate the merge join early as a > consequence of the range of join keys in "addresses" being only a fraction > of the range of join keys in "users". Notice that the total estimated > cost for the merge join is just a fraction of the full estimated cost of > the indexscan on "users"; the only way that's possible is if the indexscan > on "users" doesn't have to run through all of the table. Probably, the > range of join keys is wider than the planner thinks and so the merge join > can't terminate early. The fix therefore is to crank the stats target for > "addresses" up high enough that you get a reasonable value in pg_statistic > for the largest address_id value (look at the last histogram entry). > >> Interestingly, on another instance of this same database running on postgres 8.3.8, the query planner correctly chooses the sequential scan method - having more sane cost estimates for the index scan method. > > I think the 8.3 planner didn't take this effect into account. Or maybe it > did, but by chance the upper histogram value is closer to reality on the > older database. > > regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general