On Wed, 2004-10-27 at 01:39, Josh Berkus wrote: > Thomas, > > > As a result, I was intending to inflate the value of > > effective_cache_size to closer to the amount of unused RAM on some of > > the machines I admin (once I've verified that they all have a unified > > buffer cache). Is that correct? > > Currently, yes. I now believe the answer to that is "no, that is not fully correct", following investigation into how to set that parameter correctly. > Right now, e_c_s is used just to inform the planner and make > index vs. table scan and join order decisions. Yes, I agree that is what e_c_s is used for. ...lets go deeper: effective_cache_size is used to calculate the number of I/Os required to index scan a table, which varies according to the size of the available cache (whether this be OS cache or shared_buffers). The reason to do this is because whether a table is in cache can make a very great difference to access times; *small* tables tend to be the ones that vary most significantly. PostgreSQL currently uses the Mackert and Lohman [1989] equation to assess how much of a table is in cache in a blocked DBMS with a finite cache. The Mackert and Lohman equation is accurate, as long as the parameter b is reasonably accurately set. [I'm discussing only the current behaviour here, not what it can or should or could be] If it is incorrectly set, then the equation will give the wrong answer for small tables. The same answer (i.e. same asymptotic behaviour) is returned for very large tables, but they are the ones we didn't worry about anyway. Getting the equation wrong means you will choose sub-optimal plans, potentially reducing your performance considerably. As I read it, effective_cache_size is equivalent to the parameter b, defined as (p.3) "minimum buffer size dedicated to a given scan". M&L they point out (p.3) "We...do not consider interactions of multiple users sharing the buffer for multiple file accesses". Either way, M&L aren't talking about "the total size of the cache", which we would interpret to mean shared_buffers + OS cache, in our effort to not forget the beneficial effect of the OS cache. They use the phrase "dedicated to a given scan".... AFAICS "effective_cache_size" should be set to a value that reflects how many other users of the cache there might be. If you know for certain you're the only user, set it according to the existing advice. If you know you aren't, then set it an appropriate factor lower. Setting that accurately on a system wide basis may clearly be difficult and setting it high will often be inappropriate. The manual is not clear as to how to set effective_cache_size. Other advice misses out the effect of the many scans/many tables issue and will give the wrong answer for many calculations, and thus produce incorrect plans for 8.0 (and earlier releases also). This is something that needs to be documented rather than a bug fix. It's a complex one, so I'll await all of your objections before I write a new doc patch. [Anyway, I do hope I've missed something somewhere in all that, though I've read their paper twice now. Fairly accessible, but requires interpretation to the PostgreSQL case. Mackert and Lohman [1989] "Index Scans using a finite LRU buffer: A validated I/O model"] > The problem which Simon is bringing up is part of a discussion about doing > *more* with the information supplied by e_c_s. He points out that it's not > really related to the *real* probability of any particular table being > cached. At least, if I'm reading him right. Yes, that was how Jan originally meant to discuss it, but not what I meant. Best regards, Simon Riggs