On 28.2.2012 17:42, Claudio Freire wrote:
> On Tue, Feb 28, 2012 at 1:05 PM, Tomas Vondra <tv@xxxxxxxx> wrote:
>> On 28 February 2012, 15:24, Claudio Freire wrote:
>>> It speeds up a lot more than the initial load of data.
>>>
>>> Assuming the database is read-only, but not the filesystem (i.e. it's
>>> not a slave, in which case all this is moot - as you said, there are
>>> no writes on a slave). That is, assuming this is a read-only master,
>>> read-only queries don't mean a read-only filesystem. Bookkeeping
>>> tasks like updating catalog dbs, statistics tables, page cleanup,
>>> stuff like that can actually result in writes.
>>>
>>> Writes that go through the WAL and then the filesystem.
>>
>> I'm not sure what maintenance tasks you mean. Sure, there are tasks
>> that need to be performed after the load (stats, hint bits, updating
>> system catalogs etc.), but these may happen once right after the load
>> and then there's effectively zero write activity. Unless the database
>> needs to write temp files, but that contradicts the 'fits into RAM'
>> assumption ...
>
> AFAIK, stats need to be constantly updated.

Err, what kind of stats are we talking about - statistics capturing
characteristics of the data, or runtime stats? There's no point in
updating data stats (histograms, MCV, ...) for read-only data, and
PostgreSQL doesn't do that. Runtime stats OTOH are collected and written
continuously, that's true. But in most cases this is not a write-heavy
task, and if it is, the usual recommendation is to place pg_stat_tmp on
a ramdrive (it's usually just a few MB, written repeatedly).

> Not sure about the rest.

AFAIK it's like this:

  updating catalog tables - no updates on read-only data
  updating statistics     - data stats: no, runtime stats: yes
  page cleanup            - no (just once after the load)

> And yes, it's quite possible to require temp files even with a
> database that fits in memory - just big OLAP-style queries and a small
> enough work_mem.

Right. I'm not exactly sure how I arrived at the crazy conclusion that
writing temp files somehow contradicts the 'fits into RAM' assumption.
That's clearly nonsense ...

>> The writes are always carried out by the OS - except when dirty_ratio
>> is exceeded (but that's a different story) and WAL with direct I/O
>> enabled. The best way to allow merging the writes in shared buffers
>> or page cache is to set checkpoint_segments / checkpoint_timeout high
>> enough. That way the transactions won't need to wait for writes to
>> data files (which is the part related to evictions of buffers from
>> cache). And read-only transactions won't need to wait at all, because
>> they don't need to wait for fsync on WAL.
>
> Exactly
>
>>> In essence, what was required was to keep everything in RAM as much
>>> as possible.
>>>
>>> It *does* in the same way affect buffer eviction - it makes eviction
>>> *very* quick, and re-population equally quick, if everything fits
>>> into memory.
>>
>> No it doesn't. Only a write caused by a backend process (due to full
>> shared buffers) means immediate eviction. A simple write (caused by a
>> checkpoint) does not evict the page from shared buffers. Not even the
>> background writer evicts pages from shared buffers - it merely marks
>> them as 'clean' and leaves them there. And all those writes happen in
>> the background, so the clients don't need to wait for them to
>> complete (except for xlog checkpoints).
>
> So, we're saying the same.

Maybe. I'm still not sure how, in your opinion, fsync=off affects
eviction. I think it does not (or only very remotely), and you seemed
to be saying the opposite. IMHO the eviction of a (dirty) buffer is
either very fast or slow no matter what the fsync setting is.
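BTW if you want to check who actually writes out the dirty buffers on a
running system, pg_stat_bgwriter breaks it down. A quick sketch (column
names as in the 9.x catalogs):

  SELECT checkpoints_timed,   -- checkpoints due to checkpoint_timeout
         checkpoints_req,     -- checkpoints forced e.g. by running out
                              -- of checkpoint_segments
         buffers_checkpoint,  -- buffers written by checkpoints
         buffers_clean,       -- buffers cleaned by the background writer
         buffers_backend,     -- buffers written by the backends themselves
         buffers_alloc        -- buffers allocated
    FROM pg_stat_bgwriter;

If buffers_backend stays near zero, the backends never write dirty
buffers to make room, i.e. there are no 'expensive' evictions - with
fsync on or off.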
> With all that, and enough RAM, it already does what was requested.
>
> Maybe it would help to tune the shared_buffers-to-OS-cache ratio, and
> dirty_ratio to allow a big portion of RAM to be used for write caching
> (if there were enough writes, which I doubt), but in essence an
> unmodified postgres installation with enough RAM to hold the whole DB
> + shared buffers should perform quite optimally.

Probably, for a read-write database that fits into memory. In case of a
read-only database I don't think this really matters, because the main
issue there is temp files - and if you can stuff those into the page
cache, you can just increase work_mem instead and you're golden.
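For completeness, a trivial way to check whether a query spills to temp
files and whether raising work_mem helps. Just an illustration - the
table and the value are made up, tune them to your data:

  SET log_temp_files = 0;     -- log every temp file created
                              -- (superuser only)
  SET work_mem = '256MB';     -- per sort/hash operation, not a global
                              -- limit

  EXPLAIN ANALYZE SELECT * FROM orders ORDER BY amount;

  -- with a small work_mem the plan shows
  --   Sort Method: external merge  Disk: ...
  -- with a large enough one it shows
  --   Sort Method: quicksort  Memory: ...

Tomas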