Re: BBU Cache vs. spindles

Aidan Van Dyk <aidan@xxxxxxxxxxx> · Fri, 29 Oct 2010 11:56:09 -0400

On Fri, Oct 29, 2010 at 11:43 AM, Robert Haas <robertmhaas@xxxxxxxxx> wrote:

> Well, we COULD keep the data in shared buffers, and then copy it into
> an mmap()'d region rather than calling write(), but I'm not sure
> there's any advantage to it.  Managing address space mappings is a
> pain in the butt.

I could see this being a *theoretical* benefit in the case that the
background writer gains the ability to write out all blocks associated
with a file in order.  In that case, you might get a win because you
could get a single mmap of the entire file, and just wholesale memcpy
blocks across, then sync/unmap it.

This, of course assumes a few things that must be for it to be per formant:
0) a list of blocks to be written grouped by files is readily available.
1) The pages you write to must be in the page cache, or your memcpy is
going to fault them in.  With a plain write, you don't need the
over-written page in the cache.
2) Now, instead of the torn-page problem being FS block/sector sized
base, you can now actually have a possibly arbitrary amount of the
block memory written when the kernel writes out the page.  you
*really* need full-page-writes.
3) The mmap overhead required for the kernel to setup the mappings is
less than the repeated syscalls of a simple write().

All those things seem like something that somebody could synthetically
benchmark to prove value before even trying to bolt into PostgreSQL.

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@xxxxxxxxxxx                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance