Robert Haas wrote: > On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > Robert Haas <robertmhaas@xxxxxxxxx> writes: > >> Yeah. ?For Kevin's case, it seems like we want the caching percentage > >> to vary not so much based on which table we're hitting at the moment > >> but on how much of it we're actually reading. > > > > Well, we could certainly take the expected number of pages to read and > > compare that to effective_cache_size. ?The thing that's missing in that > > equation is how much other stuff is competing for cache space. ?I've > > tried to avoid having the planner need to know the total size of the > > database cluster, but it's kind of hard to avoid that if you want to > > model this honestly. > > I'm not sure I agree with that. I mean, you could easily have a > database that is much larger than effective_cache_size, but only that > much of it is hot. Or, the hot portion could move around over time. > And for reasons of both technical complexity and plan stability, I > don't think we want to try to model that. It seems perfectly > reasonable to say that reading 25% of effective_cache_size will be > more expensive *per-page* than reading 5% of effective_cache_size, > independently of what the total cluster size is. Late reply, but one idea is to have the executor store hit counts for later use by the optimizer. Only the executor knows how many pages it had to request from the kernel for a query. Perhaps getrusage could tell us how often we hit the disk. -- Bruce Momjian <bruce@xxxxxxxxxx> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance