> Brian Herlihy <btherl@xxxxxxxxxxxx> writes: > > What would it take for hints to be added to postgres? > > A *whole lot* more thought and effort than has been expended on the > subject to date. > > Personally I have no use for the idea of "force the planner to do > exactly X given a query of exactly Y". You don't have exactly Y > today, tomorrow, and the day after (if you do, you don't need a > hint mechanism at all, you need a mysql-style query cache). > IMHO most of the planner mistakes we see that could be fixed via > hinting are really statistical estimation errors, and so the right > level to be fixing them at is hints about how to estimate the number > of rows produced for given conditions. Mind you that's still a plenty > hard problem, but you could at least hope that a hint of that form > would be useful for more than one query. > Do I understand correctly that you're suggesting it might not be a bad idea to allow users to provide statistics? Is this along the lines of "I'm loading a big table and touching every row of data, so I may as well collect some stats along the way" and "I know my data contains these statistical properties, but the analyzer wasn't able to figure that out (or maybe can't figure it out efficiently enough)"? While it seems like this would require more knowledge from the user (e.g. more about their data, how the planner works, and how it uses statistics) this would actually be helpful/required for those who really care about performance. I guess it's the difference between a tool advanced users can get long term benefit from, or a quick fix that will probably come back to bite you. I've been pleased with Postgres' thoughtful design; recently I've been doing some work with MySQL, and can't say I feel the same way. Also, I'm guessing this has already come up at some point, but what about allowing PG to do some stat collection during queries? If you're touching a lot of data (such as an import process) wouldn't it be more efficient (and perhaps more accurate) to collect stats then, rather than having to re-scan? It would be nice to be able to turn this on/off on a per query basis, seeing as it could have pretty negative impacts on OLTP performance... - Bucky