On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Robert Haas <robertmhaas@xxxxxxxxx> writes:
>> Let's back up a moment and talk about what the overall goal is, here.
>> Ideally, we would like PostgreSQL to have excellent performance at all
>> times, under all circumstances, with minimal tuning.  Therefore, we do
>> NOT want to add variables that will, by design, need constant manual
>> adjustment.  That is why I suggested that Tom's idea of an
>> assume_cached GUC is probably not what we really want to do.  On the
>> other hand, what I understand Mladen to be suggesting is something
>> completely different.  He's basically saying that, of course, he wants
>> it to work out of the box most of the time, but since there are
>> guaranteed to be cases where it doesn't, how about providing some
>> knobs that aren't intended to be routinely twaddled but which are
>> available in case of emergency?  Bravo, I say!
>
> Um ... those are exactly the same thing.  You're just making different
> assumptions about how often you will need to twiddle the setting.
> Neither assumption is based on any visible evidence, unfortunately.
>
> I was thinking of assume_cached as something that could be
> set-and-forget most of the time, and you're entirely right to criticize
> it on the grounds that maybe it wouldn't.  But to support a proposal
> that doesn't even exist yet on the grounds that it *would* be
> set-and-forget seems a tad inconsistent.  We can't make that judgment
> without a whole lot more details than have been provided yet for any
> idea in this thread.

Well, maybe I misunderstood what you were proposing.  I had the
impression that you were proposing something that would *by design*
require adjustment for each query, so evidently I missed the point.

It seems to me that random_page_cost and seq_page_cost are pretty close
to set-and-forget already.  We don't have many reports of people needing
to tune these values on a per-query basis; most people seem to just
guesstimate a cluster-wide value and call it good.  Refining the
algorithm should only make things better.

> I do think that something based around a settable-per-table caching
> percentage might be a reasonable way to proceed.  But the devil is in
> the details, and we don't have those yet.

I think one of the larger devils in the details is deciding how to
estimate the assumed caching percentage when the user hasn't specified
one.  Frankly, I suspect that if we simply added a reloption called
assumed_caching_percentage and made it default to zero, we would make a
bunch of DBAs happy; they'd knock down seq_page_cost and
random_page_cost enough to account for the general level of caching and
then bump assumed_caching_percentage up for hot tables/indexes (or ones
that they want to have become hot).

I think we can do better than that, but the right formula isn't exactly
obvious.  I feel safe saying that if effective_cache_size=1GB and
table_size=4MB, then we ought to take the table as fully cached.  But
it's far from clear what caching percentage we should assume when
table_size=400MB, and it seems like the sort of thing that will lead to
endless bikeshedding.  There's probably no perfect answer, but I feel we
can likely come up with something that is better than a constant (which
would probably still be better than what we have now).
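
For what it's worth, here's a toy sketch in C of what such a formula
might look like.  This is purely to make the shape of the problem
concrete, not something I'm actually proposing: the function name and
the linear falloff are made up on the spot.

    /*
     * Purely illustrative, hypothetical sketch: one arbitrary way to turn
     * relation size vs. effective_cache_size into an assumed cached
     * fraction.  The point is only that *some* such curve has to be picked.
     */
    static double
    guess_cached_fraction(double rel_bytes, double effective_cache_bytes)
    {
        double      fraction;

        if (effective_cache_bytes <= 0.0)
            return 0.0;

        /*
         * Linear falloff: a 4MB table against a 1GB cache comes out ~1.0,
         * a 400MB table comes out ~0.6, anything cache-sized or larger 0.0.
         */
        fraction = 1.0 - rel_bytes / effective_cache_bytes;

        if (fraction < 0.0)
            fraction = 0.0;
        if (fraction > 1.0)
            fraction = 1.0;

        return fraction;
    }

Whether the falloff ought to be linear, or clamped at some threshold,
or something cleverer is exactly the bikeshed I'm worried about.
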
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company