Shaun Thomas <sthomas@xxxxxxxxxxxxxxxx> writes:
> But I just noticed the lag in your response. :) It turns out, even
> though I was substituting 2012-10-24 or 2012-10-25, what I really meant
> was current_date. That does make all the difference, actually.

Ah.  [ pokes at that for a while... ]  OK, this has nothing to do with
functional indexes, and everything to do with the edge-case behavior of
scalarltsel.  What you've got is a histogram whose last entry
(corresponding to the highest observed value of the date) is
current_date, and the question is what we should assume when estimating
how many rows have a value >= that.  The answer of course is "one, plus
any duplicates" ... but we don't know how many duplicates there are, and
what we do know is that it's not a particularly large number, because
the value isn't present in the most-common-values stats.  So the code
there assumes there aren't any dups.

Once you have enough histogram resolution for current_date to show up
as the next-to-last as well as the last histogram entry, then of course
the estimate gets a lot better, since we can now tell that there's at
least one histogram bin's worth of duplicates.

Interestingly, this is a case where the get_actual_variable_range patch
(commit 40608e7f, which appeared in 9.0) makes the results worse.
Before that, there was a (very arbitrary) lower bound on what we'd
believe as the selectivity of a >= condition, but now, when we know the
actual upper limit of the variable, we don't clamp the result that way.
I think the clamp must have been saving you in your previous version,
because it more-or-less-accidentally accounted for the end value not
being unique.

So the bottom line is that this is a case where you need a lot of
resolution in the histogram.  I'm not sure there's anything good we can
do to avoid that.  I spent a bit of time thinking about whether we could
use n_distinct to get some idea of how many duplicates there might be
for the endpoint value, but n_distinct is unreliable enough that I can't
develop a lot of faith in such a thing.  Or we could just arbitrarily
assume some fraction-of-a-histogram-bin's worth of duplicates, but that
would make the results worse for some people.

			regards, tom lane
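For concreteness, here is a minimal sketch of the situation (the events
table and event_date column are hypothetical, invented purely for
illustration, and the exact estimates will vary with the random sample
ANALYZE happens to take).  The newest date has duplicates, but it is
only the last histogram bound and not a most-common value, so the
planner assumes it is unique:

    -- Hypothetical setup: 1000 distinct dates, 50 rows apiece.  The
    -- distribution is uniform, so (modulo sampling noise) nothing
    -- stands out enough to land in the most-common-values list.
    CREATE TABLE events (event_date date);

    INSERT INTO events
    SELECT current_date - (i % 1000)
    FROM generate_series(1, 50000) AS s(i);

    ANALYZE events;  -- default statistics target, ~100 histogram bins

    -- current_date appears only as the last histogram bound, so the
    -- planner assumes it has no duplicates: expect a row estimate
    -- near 1, against a true count of 50.
    EXPLAIN SELECT * FROM events WHERE event_date >= current_date;

    -- See where current_date actually landed in the stats:
    SELECT most_common_vals, histogram_bounds
    FROM pg_stats
    WHERE tablename = 'events' AND attname = 'event_date';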
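And the corresponding workaround, per the above: raise the statistics
target on that column so the endpoint value gets enough resolution.
(1000 is just an illustrative value; with a big enough histogram the
endpoint value either occupies more than one histogram entry or makes
it into the most-common-values list, and either way the planner can see
the duplicates.)

    -- More histogram resolution for this hypothetical table:
    ALTER TABLE events ALTER COLUMN event_date SET STATISTICS 1000;
    ANALYZE events;

    -- The duplicates at the endpoint are now visible to the planner,
    -- so the estimate should land close to the true 50 rows.
    EXPLAIN SELECT * FROM events WHERE event_date >= current_date;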