--- Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > Personally I have no use for the idea of "force the planner to do > exactly X given a query of exactly Y". You don't have exactly Y > today, tomorrow, and the day after (if you do, you don't need a > hint mechanism at all, you need a mysql-style query cache). I don't agree here. I have "exactly Y" running millions of times daily. There's enough data that the statistics on specific values don't help all that much, even at the maximum statistics collection level. By "exactly Y" I mean the form of the query is identical, and the query plan is identical, since only the general statistics are being used for most executions of the query. The specific values vary, so caching is no help. In summary, I have a need to run "exactly Y" with query plan "exactly X". (detail in postscript) > IMHO most of the planner mistakes we see that could be fixed via > hinting are really statistical estimation errors, and so the right > level to be fixing them at is hints about how to estimate the number > of rows produced for given conditions. Do you mean something like "The selectivity of these two columns together is really X"? That would solve my specific problem. And the academic part of me likes the elegance of that solution. On the negative side, it means people must learn how the optimizer uses statistics (which I would never have done if I could have said "Use index X"). > Mind you that's still a plenty > hard problem, but you could at least hope that a hint of that form > would be useful for more than one query. Yes it would be useful for more than one query. I agree that it's the "right" level to hint at, in that it is at a higher level. Maybe the right level is not the best level though? In a business environment, you just want things to work, you don't want to analyze a problem all the way through and find the best, most general solution. As a former academic I understand the two points of view, and I don't think either is correct or wrong. Each view has its place. Since I work for a business now, my focus is on making quick fixes that keep the system running smoothly. Solving problems in the "right" way is not important. If the query slows down again later, we will examine the query plan and do whatever we have to do to fix it. It's not elegant, but it gives fast response times to the customers, and that's what matters. PS The case in question is a table with a 3-column primary key on (A, B, C). It also has an index on (B, C). Re-ordering the primary key doesn't help as I do lookups on A only as well. When I specify A, B and C (the primary key), the optimizer chooses the (B, C) index, on the assumption that specifying these two values will return only 1 row. But high correlation between B and C leads to 100s of rows being returned, and the query gets very slow. The quick fix is to say "Use index (A, B, C)". The statistics level fix would be to say "B and C really have high correlation".