Matthew <matthew@xxxxxxxxxxx> writes:
> On Thu, 6 Dec 2007, Tom Lane wrote:
>> Hmm. IIRC, there are smarts in there about whether a mergejoin can
>> terminate early because of disparate ranges of the two join variables.

> Very cool. Would that be a planner cost estimate fix (so it avoids the
> merge join), or a query execution fix (so it does the merge join on the
> table subset)?

Cost estimate fix. Basically what I'm thinking is that the startup cost
attributed to a mergejoin ought to account for any rows that have to be
skipped over before we reach the first join pair. In general this is
hard to estimate, but for mergejoin it can be estimated using the same
type of logic we already use at the other end.

After looking at the code a bit, I'm realizing that there's actually a
bug in there as of 8.3: mergejoinscansel() is expected to be able to
derive numbers for either direction of scan, but if it's asked to
compute numbers for a DESC-order scan, it looks for a pg_stats entry
sorted with '>', which isn't gonna be there. It needs to know to look
for an '<' histogram and switch the min/max. So the lack of symmetry
here is causing an actual bug in logic that already exists. That makes
the case for fixing this now a bit stronger ...

			regards, tom lane
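
For illustration only, here is a rough Python sketch of the kind of estimate discussed above. It is not the mergejoinscansel() code. The function names, the plain-list histogram representation, and the linear interpolation inside a bucket are all assumptions made for the example. It takes an ascending ('<'-ordered) list of histogram bounds for each join column and estimates what fraction of each input is skipped before the first join pair and what fraction has been read when the merge can stop. For a DESC-order scan it reuses the same '<' histograms and swaps the roles of min and max.

```python
def fraction_below(bounds, value):
    """Estimate the fraction of column values below `value`.

    `bounds` is an ascending list of equi-depth histogram boundaries
    (hypothetical stand-in for a '<'-sorted histogram). Values are
    assumed to be spread evenly within each bucket.
    """
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    nbuckets = len(bounds) - 1
    for i in range(nbuckets):
        lo, hi = bounds[i], bounds[i + 1]
        if value < hi:
            # Linear interpolation inside the bucket that contains `value`.
            within = (value - lo) / (hi - lo) if hi > lo else 0.0
            return (i + within) / nbuckets
    return 1.0


def merge_scan_fractions(left_bounds, right_bounds, descending=False):
    """Return (left_start, left_end, right_start, right_end).

    *_start: fraction of that input skipped before the first join pair.
    *_end:   fraction of that input read by the time the merge can stop.
    Both histograms are ascending; a descending scan uses the same
    histograms with the min and max of the other side swapped.
    """
    lmin, lmax = left_bounds[0], left_bounds[-1]
    rmin, rmax = right_bounds[0], right_bounds[-1]
    if not descending:
        left_start = fraction_below(left_bounds, rmin)
        left_end = fraction_below(left_bounds, rmax)
        right_start = fraction_below(right_bounds, lmin)
        right_end = fraction_below(right_bounds, lmax)
    else:
        # Scanning from the top: rows above the other side's max are
        # skipped first, and the scan stops at the other side's min.
        left_start = 1.0 - fraction_below(left_bounds, rmax)
        left_end = 1.0 - fraction_below(left_bounds, rmin)
        right_start = 1.0 - fraction_below(right_bounds, lmax)
        right_end = 1.0 - fraction_below(right_bounds, lmin)
    return left_start, left_end, right_start, right_end


if __name__ == "__main__":
    left = [0, 25, 50, 75, 100]    # made-up histogram bounds
    right = [50, 60, 70, 80, 90]   # made-up histogram bounds
    print(merge_scan_fractions(left, right))
    print(merge_scan_fractions(left, right, descending=True))
```

With these made-up bounds, an ascending scan has to skip roughly the lower half of the left input before it can produce a join pair, which is the startup cost the estimate ought to account for. The sketch treats "less than" and "less than or equal" as the same fraction, which is a simplification.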