On Wed, Nov 10, 2010 at 10:15 AM, Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
> But wait -- it turns out that this pain was self-inflicted. Based
> on heavy testing of the interactive queries which users run against
> this database we tuned the database for "fully-cached" settings,
> with both random_page_cost and seq_page_cost at 0.1. In a
> practical sense, the users are almost always running these queries
> against very recent data which is, in fact, heavily cached -- so
> it's no surprise that the queries they run perform best with plans
> based on such costing. The problem is that these weekly maintenance
> runs need to pass the entire database, so caching effects are far
> less pronounced. If I set seq_page_cost = 1 and random_page_cost =
> 2 I get exactly the same (fast) plan as above.
>
> I guess the lesson here is not to use the same costing for
> database-wide off-hours maintenance queries as for ad hoc queries
> against a smaller set of recent data by users who expect quick
> response time. I'm fine with tweaking the costs in our maintenance
> scripts, but it does tend to make me daydream about how the
> optimizer might possibly auto-tweak such things....

Wow. That's fascinating, and if you don't mind, I might mention this
potential problem in a future talk at some point.

I've given some thought in the past to trying to maintain some model of
which parts of the database are likely to be cached, and trying to
adjust costing estimates based on that data. But it's a really hard
problem, because what is and is not in cache can change relatively
quickly, and you don't want to have too much plan instability. Also,
for many workloads, you'd need to have pretty fine-grained statistics
to figure out anything useful, which would be expensive and difficult
to maintain.

But thinking over what you've written here, I'm reminded of something
Peter said years ago, also about the optimizer. He was discussing the
ratio of the estimated cost to the actual cost and made an off-hand
remark that efforts had been made over the years to make that ratio
more consistent (i.e. improve the quality of the cost estimates) but
that they'd been abandoned because they didn't necessarily produce
better plans. Applying that line of thinking to this problem, maybe we
should give up on trying to make the estimates truly model reality, and
focus more on assigning them values which work well in practice. For
example, in your case, it would be sufficient to estimate the amount of
data that a given query is going to grovel through and then apply some
heuristic to choose values for random_page_cost and seq_page_cost based
on the ratio of that value to, I don't know, effective_cache_size.
Unfortunately, to know how much data we're going to grovel through, we
need to know the plan; and to decide on the right plan, we need to know
how much data we're going to grovel through.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
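
For the maintenance-script tweak Kevin describes, a per-session override
along these lines should be enough (just a sketch -- the cost values are
the ones quoted above, and the placeholder comment stands in for whatever
the weekly maintenance script actually runs):

    -- Use disk-realistic planner costs for this session only; the
    -- cluster-wide "fully-cached" settings (0.1/0.1) stay in effect
    -- for the interactive users.
    SET seq_page_cost = 1;
    SET random_page_cost = 2;

    -- ... run the database-wide maintenance queries here ...

    -- The overrides go away when the session ends, or can be undone
    -- explicitly:
    RESET seq_page_cost;
    RESET random_page_cost;

Using SET LOCAL inside a transaction is an alternative if the script
wants the overrides to expire at commit rather than at session end.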