On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> writes:
>> Robert Haas <robertmhaas@xxxxxxxxx> wrote:
>>> Unfortunately, to know how much data we're going to grovel
>>> through, we need to know the plan; and to decide on the right
>>> plan, we need to know how much data we're going to grovel through.
>
>> And that's where they've been ending.
>
>> The only half-sane answer I've thought of is to apply a different
>> cost to full-table or full-index scans based on the ratio with
>> effective cache size.

Kevin, yes, good point. Bravo! Let's do that. Details TBD, but suppose effective_cache_size = 1GB. What we know for sure is that a 4 GB table is not going to be fully cached, but a 4 MB table may well be. In fact, I think we should assume that the 4 MB table IS cached, because the point is that if it's used at all, it soon will be. It's almost certainly a bad idea to build a plan around minimizing reads from that 4 MB table in favor of doing a substantial amount of additional work somewhere else. I suppose this could break down if you had hundreds and hundreds of 4 MB tables, all of which were accessed regularly, but that's an unusual situation, and anyway it's not clear that assuming them all uncached would be any better than assuming them all cached.

> This might have some connection to some rather half-baked ideas I've
> been having in connection with the generalized-inner-indexscan problem.
> I don't have anything in the way of a coherent presentation to make yet,
> but the thing I'm being forced to realize is that sane modeling of a
> complex subplan that's on the inside of a nestloop join requires
> treating *every* scan type as having different costs "the first time"
> versus "during rescan". If the total amount of data touched in the
> query is less than effective_cache_size, it's not unreasonable to
> suppose that I/O costs during rescan might be zero, even for a seqscan or
> a non-parameterized indexscan. In fact, only parameterized indexscans
> would be able to touch pages they'd not touched the first time, and so
> they ought to have higher not lower rescan costs in this environment.
> But once the total data volume exceeds effective_cache_size, you have to
> do something different since you shouldn't any longer assume the data is
> all cached from the first scan. (This isn't quite as hard as the case
> you're talking about, since I think the relevant data volume is the sum
> of the sizes of the tables used in the query; which is easy to
> estimate at the start of planning, unlike the portion of the tables
> that actually gets touched.)

Well, we don't want the costing model to have sharp edges. effective_cache_size can't be taken as much more than an educated guess, and what actually happens will depend a lot on what else is going on on the system. If only one query is running at a time and it is repeatedly seq-scanning a large table, the cost of reading pages in will be very small until the table grows too large to fit in memory at once, and then will abruptly go through the roof. But realistically you're not going to know exactly where that edge is, because you can't predict how much concurrent activity there will be, or how much of the OS buffer cache work_mem allocations will push out.

So I'm thinking we should start the costs at something like 0.05/0.05 for tables that are much smaller than effective_cache_size and ramp up to 4/1 for tables that are larger than effective_cache_size. Maybe just by linearly ramping up, although that has a certain feeling of being without mathematical soundness.
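To make that a bit more concrete, here's a back-of-the-envelope sketch of the linear ramp. It's not proposed planner code; the names (PageCosts, ramp_page_costs) are made up for illustration, sizes are in 8kB pages, and the endpoints are assumed to be 0.05 for both costs when the relation is much smaller than effective_cache_size and the stock 4.0 random / 1.0 sequential once it reaches that size:

#include <stdio.h>

/* Illustrative only -- not actual PostgreSQL planner code. */
typedef struct
{
    double      seq_page_cost;
    double      random_page_cost;
} PageCosts;

static PageCosts
ramp_page_costs(double rel_pages, double cache_pages)
{
    const double cached_cost = 0.05;     /* both costs when fully cached */
    const double uncached_seq = 1.0;     /* stock seq_page_cost */
    const double uncached_random = 4.0;  /* stock random_page_cost */

    /* fraction of effective_cache_size the relation would occupy */
    double      frac = rel_pages / cache_pages;
    PageCosts   c;

    if (frac > 1.0)
        frac = 1.0;          /* larger than cache: full uncached costs */

    c.seq_page_cost = cached_cost + frac * (uncached_seq - cached_cost);
    c.random_page_cost = cached_cost + frac * (uncached_random - cached_cost);
    return c;
}

int
main(void)
{
    double      cache_pages = 131072.0;  /* effective_cache_size = 1GB */
    PageCosts   small = ramp_page_costs(512.0, cache_pages);     /* 4MB table */
    PageCosts   large = ramp_page_costs(524288.0, cache_pages);  /* 4GB table */

    printf("4MB table: seq=%.3f random=%.3f\n",
           small.seq_page_cost, small.random_page_cost);
    printf("4GB table: seq=%.3f random=%.3f\n",
           large.seq_page_cost, large.random_page_cost);
    return 0;
}

With effective_cache_size = 1GB, that puts a 4 MB table essentially at the cached costs (about 0.054/0.065) and a 4 GB table at the stock 1.0/4.0, with no sharp edge anywhere in between.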
> An idea that isn't even half-baked yet is that once we had a cost model
> like that, we might be able to produce plans that are well-tuned for a
> heavily cached environment by applying the "rescan" cost model even to
> the first scan for a particular query. So that might lead to some sort
> of "assume_cached" GUC parameter, and perhaps Kevin could tune his
> reporting queries by turning that off instead of messing with individual
> cost constants.

I think the real goal here should be to try to avoid needing a GUC. A lot of people could benefit if the system could make some attempt to recognize on its own which queries are likely to be cached. We already have parameters you can hand-tune for each query as necessary. Being able to set some parameters system-wide and then get sensible behavior automatically would be much nicer.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company