Jake Magner <jakemagner90@xxxxxxxxx> writes: > I am having problems with a join where the planner picks a merge join and an > index scan on one of the tables. Manually disabling merge joins and running > the query both ways shows the merge join takes over 10 seconds while a hash > join takes less than 100ms. The planner total cost estimate favors the merge > join, but the cost estimate for the index scan part is greater than the > total cost estimate by a factor of 300x. My understanding of how this can > occur is that it expects it won't actually have to scan all the rows, > because using the histogram distribution stats it can know that all the > relevant rows of the join column will be at the beginning of the scan. But > in practice it appears to actually be index scanning all the rows, showing > massive amounts of page hits. > ... > If we change the filter from "type = 'vehicle'" (True for a small fraction > of the rows) to "freq > -1" (True for all rows) then the plan is the same, > but the actual time and page hits are much less and the query returns is > fast. I think what must be happening is that the planner notes the maximum possible value of v.id and supposes that the mergejoin will stop far short of completion because v.id spans just a small part of the range of usagestats.tid. Which it does, when you have only the nonselective filter condition on usagestats. However, the executor cannot stop until it's fetched a usagestats row that has a tid value larger than the last v.id value; otherwise it can't be sure it's emitted all the required join rows. I'm guessing that the "type = 'vehicle'" condition eliminates all such rows, or at least enough of them that a very large part of the usagestats table has to be scanned to find the first can't-possibly-match row. I'm not sure there's anything much we can do to improve this situation in Postgres. It seems like a sufficiently bizarre corner case that it wouldn't be appropriate to spend planner cycles checking for it, and I'm not sure how we'd check for it even if we were willing to spend those cycles. You might consider altering the query, or inserting some kind of dummy sentinel row in the data, or changing the schema (is it really sensible to keep vehicle usagestats in the same table as other usagestats?). A brute-force fix would be "enable_mergejoin = off", but that would prevent selecting this plan type even when it actually is a significant win. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance