On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote:
> On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> > "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> writes:
> >> Robert Haas <robertmhaas@xxxxxxxxx> wrote:
> >>> Unfortunately, to know how much data we're going to grovel
> >>> through, we need to know the plan; and to decide on the right
> >>> plan, we need to know how much data we're going to grovel through.
> >
> >> And that's where they've been ending.
> >
> >> The only half-sane answer I've thought of is to apply a different
> >> cost to full-table or full-index scans based on the ratio with
> >> effective cache size.
>
> Kevin, yes, good point. Bravo! Let's do that. Details TBD, but
> suppose effective_cache_size = 1GB. What we know for sure is that a
> 4 GB table is not going to be fully cached but a 4 MB table may well
> be. In fact, I think we should assume that the 4 MB table IS cached,
> because the point is that if it's used at all, it soon will be. It's
> almost certainly a bad idea to build a plan around the idea of
> minimizing reads from that 4 MB table in favor of doing a substantial
> amount of additional work somewhere else. I suppose this could break
> down if you had hundreds and hundreds of 4 MB tables all of which
> were accessed regularly, but that's an unusual situation, and anyway
> it's not clear that assuming them all uncached is going to be any
> better than assuming them all cached.
>
> > This might have some connection to some rather half-baked ideas
> > I've been having in connection with the generalized-inner-indexscan
> > problem. I don't have anything in the way of a coherent
> > presentation to make yet, but the thing I'm being forced to realize
> > is that sane modeling of a complex subplan that's on the inside of
> > a nestloop join requires treating *every* scan type as having
> > different costs "the first time" versus "during rescan". If the
> > total amount of data touched in the query is less than
> > effective_cache_size, it's not unreasonable to suppose that I/O
> > costs during rescan might be zero, even for a seqscan or a
> > non-parameterized indexscan. In fact, only parameterized indexscans
> > would be able to touch pages they'd not touched the first time, and
> > so they ought to have higher not lower rescan costs in this
> > environment. But once the total data volume exceeds
> > effective_cache_size, you have to do something different since you
> > shouldn't any longer assume the data is all cached from the first
> > scan. (This isn't quite as hard as the case you're talking about,
> > since I think the relevant data volume is the sum of the sizes of
> > the tables used in the query; which is easy to estimate at the
> > start of planning, unlike the portion of the tables that actually
> > gets touched.)
>
> Well, we don't want the costing model to have sharp edges.
> effective_cache_size can't be taken as much more than an educated
> guess, and what actually happens will depend a lot on what else is
> going on on the system. If only one query is running on a system at
> a time and it is repeatedly seq-scanning a large table, the cost of
> reading pages in will be very small until the table grows large
> enough that you can't fit the whole thing in memory at once, and then
> will abruptly go through the roof. But realistically you're not
> going to know exactly where that edge is going to be, because you
> can't predict exactly how much concurrent activity there will be, for
> example, or how much work_mem allocations will push out of the OS
> buffer cache. So I'm thinking we should start the costs at something
> like 0.05/0.05 for tables that are much smaller than
> effective_cache_size and ramp up to 4/1 for tables that are larger
> than effective_cache_size. Maybe just by linearly ramping up,
> although that has a certain feeling of being without mathematical
> soundness.
>
> > An idea that isn't even half-baked yet is that once we had a cost
> > model like that, we might be able to produce plans that are
> > well-tuned for a heavily cached environment by applying the
> > "rescan" cost model even to the first scan for a particular query.
> > So that might lead to some sort of "assume_cached" GUC parameter,
> > and perhaps Kevin could tune his reporting queries by turning that
> > off instead of messing with individual cost constants.
>
> I think the real goal here should be to try to avoid needing a GUC.
> A lot of people could benefit if the system could make some attempt
> to recognize on its own which queries are likely to be cached. We
> already have parameters you can hand-tune for each query as
> necessary. Being able to set some parameters system-wide and then get
> sensible behavior automatically would be much nicer.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

I agree with the goal of avoiding the need for a GUC. This needs to be
as automatic as possible.

One idea I had was to compute a value for the amount of cached data in
the system by keeping a sum, or a weighted sum, of table usage across
the system. Smaller tables and indexes would contribute less to the
total, while larger tables and indexes would contribute more. Then, by
comparing this running total to effective_cache_size, the planner
could set the random and sequential page costs for a query. This would
let the case of many 4MB tables lean toward disk I/O costs rather than
memory I/O costs, even though each table on its own looks cacheable.
The weighting could be a function of the number of simultaneous users
of the table.

I know this is a bit of hand-waving, but some sort of dynamic feedback
needs to be provided to the planning process as system use increases.
To make it a little more concrete, two rough sketches follow: one of
the ramp Robert describes, and one of the weighted total that could
feed it.
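First, the ramp itself. This is purely illustrative standalone C, not
actual planner code; the 10% "much smaller" cutoff and the linear
blend between the 0.05/0.05 and 4/1 endpoints are assumptions made up
just so there is something to look at.

/*
 * Rough sketch of the cost ramp Robert described: illustrative only,
 * not PostgreSQL planner code. The cutoffs and the linear
 * interpolation are assumptions for discussion.
 */
#include <stdio.h>

typedef struct
{
    double      seq_page_cost;
    double      random_page_cost;
} PageCosts;

/*
 * Interpolate between "fully cached" costs (0.05/0.05) and "on disk"
 * costs (1.0 sequential / 4.0 random) as the relation grows relative
 * to effective_cache_size. Tables much smaller than the cache are
 * treated as cached; tables at or above the cache size get the full
 * disk costs; in between we ramp linearly, which at least keeps the
 * model free of sharp edges.
 */
static PageCosts
ramp_page_costs(double rel_bytes, double effective_cache_bytes)
{
    const double cached_cost = 0.05;
    const double disk_seq = 1.0;
    const double disk_random = 4.0;

    /* "much smaller" is taken here, arbitrarily, as <= 10% of cache */
    double      lo = 0.10 * effective_cache_bytes;
    double      hi = effective_cache_bytes;
    double      frac;
    PageCosts   c;

    if (rel_bytes <= lo)
        frac = 0.0;
    else if (rel_bytes >= hi)
        frac = 1.0;
    else
        frac = (rel_bytes - lo) / (hi - lo);

    c.seq_page_cost = cached_cost + frac * (disk_seq - cached_cost);
    c.random_page_cost = cached_cost + frac * (disk_random - cached_cost);
    return c;
}

int
main(void)
{
    double      ecs = 1024.0 * 1024.0 * 1024.0;     /* 1GB, as in the example */
    double      sizes[] = {4e6, 256e6, 512e6, 2e9}; /* 4MB .. 2GB */
    int         i;

    for (i = 0; i < 4; i++)
    {
        PageCosts   c = ramp_page_costs(sizes[i], ecs);

        printf("%8.0f MB: seq=%.3f random=%.3f\n",
               sizes[i] / 1e6, c.seq_page_cost, c.random_page_cost);
    }
    return 0;
}

With those numbers a 4 MB table comes out at 0.05/0.05 and a 2 GB
table at 1/4, with a smooth transition in between rather than a cliff
at effective_cache_size.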
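Second, the kind of weighted running total I am hand-waving about,
boiled down to a "how cached can we assume we are" fraction. Again,
the names, the weighting by user count, and the shape of the feedback
are all made up for illustration; the real accounting would have to
come from the statistics the server already tracks.

/*
 * Equally rough sketch of the weighted running total idea. The point
 * is only to show the shape of the feedback: sum up size times a
 * concurrency weight over the hot tables and indexes, compare that to
 * effective_cache_size, and let the resulting pressure pull the page
 * costs toward the disk end of the ramp even when each individual
 * table is small.
 */
#include <stdio.h>

typedef struct
{
    double      size_bytes;     /* table or index size */
    int         active_users;   /* simultaneous users touching it */
} RelUsage;

/* Weighted total of what is competing for cache right now. */
static double
weighted_cache_demand(const RelUsage *rels, int nrels)
{
    double      total = 0.0;
    int         i;

    /* the 0.25-per-user weight is arbitrary */
    for (i = 0; i < nrels; i++)
        total += rels[i].size_bytes * (1.0 + 0.25 * rels[i].active_users);
    return total;
}

/*
 * Fraction of the working set we can plausibly assume is cached:
 * 1.0 when demand fits inside effective_cache_size, shrinking toward
 * 0 as demand grows past it.
 */
static double
assumed_cached_fraction(double demand, double effective_cache_bytes)
{
    if (demand <= effective_cache_bytes)
        return 1.0;
    return effective_cache_bytes / demand;
}

int
main(void)
{
    /* two hundred 4MB tables, each with a couple of active users */
    RelUsage    rels[200];
    double      ecs = 1024.0 * 1024.0 * 1024.0;     /* 1GB */
    double      demand;
    double      frac;
    int         i;

    for (i = 0; i < 200; i++)
    {
        rels[i].size_bytes = 4.0 * 1024.0 * 1024.0;
        rels[i].active_users = 2;
    }

    demand = weighted_cache_demand(rels, 200);
    frac = assumed_cached_fraction(demand, ecs);
    printf("demand=%.0f MB, assumed cached fraction=%.2f\n",
           demand / 1e6, frac);
    return 0;
}

That fraction could then be used to pull the ramp's output toward the
disk end, so a workload of many small but busy tables is not costed as
if everything were sitting in memory.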
Regards,
Ken

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance