Colin McGuigan <cmcguigan@xxxxxxxxxxxxxxx> writes:
>   ->  Subquery Scan s  (cost=0.00..21.93 rows=1 width=8)
>         Filter: ((userid = 123456) AND (locationid IS NULL))
>         ->  Limit  (cost=0.00..15.30 rows=530 width=102)
>               ->  Seq Scan on staff  (cost=0.00..15.30 rows=530 width=102)

There does seem to be a bug here, but not the one you think: the rows=1
estimate for the subquery node seems a bit silly given that it knows
there are 530 rows in the underlying query.  I'm not sure how bright the
code is about finding stats for variables emitted by a subquery, but
even with totally default estimates it should not come up with a
selectivity of 1/500 for the filter.  Unfortunately, fixing that is
likely to bias it further away from the plan you want ...

> Furthermore, I can repeat this experiment over and over, so I know that
> its not caching.

You mean it *is* caching.

> I'd really prefer this query run in < 1 second rather than > 45, but I'd
> really like to do that without having hacks like adding in pointless
> LIMIT clauses.

The right way to do it is to adjust the planner cost parameters.
The standard values of those are set on the assumption of
tables-much-bigger-than-memory, a situation in which the planner's
preferred plan probably would be the best.  What you are testing here
is most likely a situation in which the whole of both tables fits in
RAM.  If that pretty much describes your production situation too,
then you should decrease seq_page_cost and random_page_cost.  I find
setting them both to 0.1 produces estimates that are more nearly in
line with true costs for all-in-RAM situations.  (Pre-8.2, there's
no seq_page_cost, so instead set random_page_cost to 1 and inflate
all the cpu_xxx cost constants by 10.)

			regards, tom lane
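
A minimal sketch of the settings described above (the cpu_xxx values shown
assume the stock defaults, so "inflate by 10" is just each default times ten;
treat these as starting points and verify against your own timings):

    -- 8.2 and later: try per-session first, then persist in postgresql.conf
    -- once the all-in-RAM assumption is confirmed for production.
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;

    -- Pre-8.2 (no seq_page_cost): set random_page_cost to 1 and scale the
    -- CPU cost constants up by 10x from their defaults.
    SET random_page_cost = 1;
    SET cpu_tuple_cost = 0.1;         -- default 0.01
    SET cpu_index_tuple_cost = 0.05;  -- default 0.005
    SET cpu_operator_cost = 0.025;    -- default 0.0025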