On 2016-03-12 21:00:04 +0000, Geoff Winkless wrote: > On 12 March 2016 at 18:43, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote: > > The question is what can be done to improve the situation. > > > > Tom thinks that correlation statistics would help. That seems plausible > > to me. [...] > > You claim that no statistics are needed. > > Well that's a bit confrontational. Sorry. Didn't want to sound confrontational. I was just repeating points made by Tom and you previously in this thread to establish a baseline. > > That may or may not be true: You haven't proposed an alternate method > > yet. > > You could make an assumption that perfect distribution isn't true: > that actually the distribution is within a certain _deviation_ of that > perfect distribution. It wouldn't have to have been very much to make > the index-only scan win here and would still keep the planner from > choosing less optimal queries most of the time (and where it did end > up making the "wrong" choice it's not going to be far off anyway). > > But I'm making assumptions here, I'm aware of that. Chances are that > actually most people's data _does_ fit into this perfect distribution > set. Is there any research that shows that real-world data usually > does? I don't think most people's data is perfectly distributed. But as you say most data is probably within some deviation of being perfectly distributed and as long as that deviation isn't too big it doesn't matter. But there are certainly some common examples of highly correlated columns. Having a serial id and a date as in your case is probably quite common. Another example might be a surrogate primary key which is computed from some other fields (e.g. a timeseries code starting with a country code, or a social security number starting with the birth date, ...). That's probably not that uncommon either. So, I agree with you. This is a problem and it should be fixed. I'm just sceptical that it can be done with a simple cost adjustment. > As Jeff points out I'd have a much larger win in this instance by > someone spending the time implementing skip index scans rather than > messing with the planner :) Yeah. I think I have some code which could benefit from this, too. I'll have to try that trick from the wiki. hp -- _ | Peter J. Holzer | I want to forget all about both belts and |_|_) | | suspenders; instead, I want to buy pants | | | hjp@xxxxxx | that actually fit. __/ | http://www.hjp.at/ | -- http://noncombatant.org/
Attachment:
signature.asc
Description: Digital signature