On Wed, 2006-09-27 at 11:48 +0200, Tobias Brox wrote:
> [Tom Lane - Tue at 06:09:56PM -0400]
> > If your tables are small enough to fit (mostly) in memory, then the
> > planner tends to overestimate the cost of a nestloop because it fails to
> > account for caching effects across multiple scans of the inner table.
> > This is addressed in 8.2, but in earlier versions about all you can do
> > is reduce random_page_cost, and a sane setting of that (ie not less than
> > 1.0) may not be enough to push the cost estimates where you want them.
> > Still, reducing random_page_cost ought to be your first recourse.
>
> Thank you.  Reducing the random page cost did reduce the nested loop
> cost significantly; sadly, the merge join costs were reduced even
> further, causing the planner to favor those even more than before.
> Setting effective_cache_size really low solved the issue, but I
> believe we would rather have a high effective_cache_size.
>
> Eventually, setting effective_cache_size to near zero and setting
> random_page_cost to 1 could maybe be a desperate measure.  Another one
> is to turn off merge/hash joins and seq scans.  It could be a worthwhile
> experiment if nothing else :-)
>
> The bulk of our database is historical data that most often is not
> touched at all, though one never knows for sure until the queries have
> run all the way through - so table partitioning does not seem to be an
> option.  My general idea is that nested loops would cause the most
> recent data and the most important parts of the indexes to stay in the
> OS cache.  Does this make sense from an expert's point of view? :-)

Have you tried changing the cpu_* cost options to see how they affect
merge versus nested loop?
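
For example, in a psql session you could try something along these lines
(the numbers are only the usual defaults and illustrative values, not
recommendations, and the query is a placeholder for your own):

    -- per-session experiment only; values below are illustrative
    SET cpu_tuple_cost = 0.01;           -- default
    SET cpu_index_tuple_cost = 0.001;    -- default
    SET cpu_operator_cost = 0.0025;      -- default
    SET random_page_cost = 2;            -- default is 4; keep it >= 1.0
    -- the last-resort toggles you mention, per session only:
    SET enable_mergejoin = off;
    SET enable_hashjoin = off;
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT ...;          -- your problem query here

Comparing the estimated and actual costs in the EXPLAIN ANALYZE output
before and after each change should show which knob actually moves the
merge join estimate relative to the nested loop.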