On Wed, Mar 10, 2010 at 9:28 PM, Paul McGarry <paul@xxxxxxxxxxxxxxx> wrote:
> Hi there,
>
> I'm after a little bit of advice on the shared_buffers setting (I have
> read the various docs on/linked from the performance tuning wiki page,
> some very helpful stuff there so thanks to those people).
>
> I am setting up a 64-bit Linux server running PostgreSQL 8.3. The
> server has 64 gigs of memory and Postgres is the only major application
> running on it. (This server is to go alongside some existing 8.3
> servers; we will look at 8.4/9 migration later.)
>
> I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
> and the OS page cache interact. The general advice seems to be to
> assign 1/4 of RAM to shared_buffers.
>
> I don't have a good knowledge of the internals, but I'm wondering if
> this will effectively mean that roughly the same amount of RAM being
> used for the OS page cache will be used for redundantly caching
> something that Postgres is caching as well?
>
> I.e. when Postgres reads something from disk it will go into both the
> OS page cache and the PostgreSQL shared_buffers, and the OS page cache
> copy is unlikely to be useful for anything.
>
> If that is the case, what are the downsides to having less overlap
> between the caches, i.e. heavily favouring one or the other, such as
> allocating shared_buffers a much larger percentage (such as 90-95%
> of expected 'free' memory)?

I've personally heard tons of anecdotal evidence regarding the
shared_buffers setting. There is a bit of benchmarking info suggesting
you can eke out marginal gains by tuning it, but you unfortunately have
to take O/S, hardware, filesystem and other factors into account.

Here is what I'm pretty confident about saying:

*) A page fault to disk is a much bigger deal than a fault from the
Postgres cache to the O/S cache. Many people assume that raising
shared_buffers decreases the chance of a disk fault. It doesn't -- at
least not in the simple way you would think -- all modern operating
systems aggressively cache filesystem data already, so we are simply
layering over the O/S cache. If your database is really big, anything
that reduces disk faults is a win and anything that increases them is
a loss. TPS measurements from pgbench are not as interesting to me as
IOPS from the disk system.

*) shared_buffers effects are hard to detect in the single-user case.
A single large, non-disk-bound query will perform pretty much the same
regardless of how shared_buffers is set. In other words, you will not
be able to easily measure differences in the setting outside of a real
or simulated production workload.

*) shared_buffers is one of the _least_ important performance settings
in postgresql.conf. Many other settings -- work_mem, planner tweaks,
commit settings, autovacuum settings -- can dramatically impact your
workload performance in spectacular ways, but tend to be case-by-case
specific. shared_buffers affects _everything_, albeit in very subtle
ways, so you have to be careful.

*) I sometimes wonder if the O/S should just manage everything. We
just said goodbye to the FSM (thank goodness for that!) -- what about
a fully O/S-managed cache? Goodbye SysV IPC? Note that my views here
are very simplistic; I don't have anything close to a full
understanding of the cache machinery in the database.

merlin
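
To make the 1/4-of-RAM starting point discussed above concrete for the
64GB box in the original question, a minimal postgresql.conf sketch
might look like the following (the 16GB figure is just that rule of
thumb, not a benchmarked recommendation, and on 8.3 the kernel's SHMMAX
would also need to be raised to allow a SysV shared memory segment this
large):

    # postgresql.conf -- hypothetical starting point for a 64GB server
    shared_buffers = 16GB    # ~1/4 of RAM, per the common guideline

Following the point that disk IOPS are more telling than pgbench TPS,
one simple way to observe the effect of a change is to watch the disks
while driving a simulated workload (database name, client count and
transaction count here are arbitrary examples):

    $ iostat -x 5                   # one terminal: per-device utilization/IOPS
    $ pgbench -c 16 -t 10000 mydb   # another: simulated concurrent workload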