On 7 March 2016 at 16:02, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > In English, what that plan is trying to do is scan the index > in sc_id order until it hits a row with scdate in the target range. > The first such row, by definition, has the correct min(sc_id) value. > The problem is that we're guessing at how soon we'll hit such a row. > If the columns are independent, then the planner can guess based on how > many rows in the whole table have scdate in the target range, and it > will probably be about right. But that estimate can fall down very > badly if sc_id and scdate increase together, because then the target > rows aren't randomly distributed in the index sequence but could all be > all the way at the far end of the index. I'm sorry, I'm obviously not being clear. I already accepted this argument when Victor gave it, although I believe that in part it falls down because sc_id is also (potentially) randomly distributed so it's not like you're doing a sequential table scan (it might work better on a clustered table, but we don't have those :) ) So you still have an extra layer of indirection into a large table with lots of random accesses. > If we had cross-column correlation stats we could detect this pitfall, > but without that it's hard to do. But as far as I can see, apart from the absolute extremes, the index-only scan is _always_ going to be quicker than the index+table scan. It doesn't matter whether or not the distribution is random or skewed, the index-only scan is going to be better (or approximately equally as good). We can see that by the massive speedup I get by using index(scid,scdate), which in all other respects is going to suffer from exactly the same problem from that the scid-only index suffers. And the real advantage: at the extremes, the index-only worst-case is minimally worse than the best case. Whereas the worst-case of the index-scan-plus-table-compare method is horrific. I don't believe you need any further statistics than what is currently available to be able to make that judgement, and that's why I believe it's suboptimal. Geoff -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general