Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

Simon Riggs <simon@xxxxxxxxxxxxxxx> · Mon, 01 Nov 2004 22:03:58 +0000

On Wed, 2004-10-27 at 01:39, Josh Berkus wrote:
> Thomas,
> 
> > As a result, I was intending to inflate the value of
> > effective_cache_size to closer to the amount of unused RAM on some of
> > the machines I admin (once I've verified that they all have a unified
> > buffer cache). Is that correct?
> 
> Currently, yes.  

I now believe the answer to that is "no, that is not fully correct",
following investigation into how to set that parameter correctly.

> Right now, e_c_s is used just to inform the planner and make 
> index vs. table scan and join order decisions.

Yes, I agree that is what e_c_s is used for.

...lets go deeper:

effective_cache_size is used to calculate the number of I/Os required to
index scan a table, which varies according to the size of the available
cache (whether this be OS cache or shared_buffers). The reason to do
this is because whether a table is in cache can make a very great
difference to access times; *small* tables tend to be the ones that vary
most significantly. PostgreSQL currently uses the Mackert and Lohman
[1989] equation to assess how much of a table is in cache in a blocked
DBMS with a finite cache. 

The Mackert and Lohman equation is accurate, as long as the parameter b
is reasonably accurately set. [I'm discussing only the current behaviour
here, not what it can or should or could be] If it is incorrectly set,
then the equation will give the wrong answer for small tables. The same
answer (i.e. same asymptotic behaviour) is returned for very large
tables, but they are the ones we didn't worry about anyway. Getting the
equation wrong means you will choose sub-optimal plans, potentially
reducing your performance considerably.

As I read it, effective_cache_size is equivalent to the parameter b,
defined as (p.3) "minimum buffer size dedicated to a given scan". M&L
they point out (p.3) "We...do not consider interactions of
multiple users sharing the buffer for multiple file accesses". 

Either way, M&L aren't talking about "the total size of the cache",
which we would interpret to mean shared_buffers + OS cache, in our
effort to not forget the beneficial effect of the OS cache. They use the
phrase "dedicated to a given scan"....

AFAICS "effective_cache_size" should be set to a value that reflects how
many other users of the cache there might be. If you know for certain
you're the only user, set it according to the existing advice. If you
know you aren't, then set it an appropriate factor lower. Setting that
accurately on a system wide basis may clearly be difficult and setting
it high will often be inappropriate.

The manual is not clear as to how to set effective_cache_size. Other
advice misses out the effect of the many scans/many tables issue and
will give the wrong answer for many calculations, and thus produce
incorrect plans for 8.0 (and earlier releases also).

This is something that needs to be documented rather than a bug fix.
It's a complex one, so I'll await all of your objections before I write
a new doc patch.

[Anyway, I do hope I've missed something somewhere in all that, though
I've read their paper twice now. Fairly accessible, but requires
interpretation to the PostgreSQL case. Mackert and Lohman [1989] "Index
Scans using a finite LRU buffer: A validated I/O model"]

> The problem which Simon is bringing up is part of a discussion about doing 
> *more* with the information supplied by e_c_s.    He points out that it's not 
> really related to the *real* probability of any particular table being 
> cached.   At least, if I'm reading him right.

Yes, that was how Jan originally meant to discuss it, but not what I meant.

Best regards,

Simon Riggs