On 08/24/2011 11:33 AM, Samba wrote:
That's common to see. Systems that regularly UPDATE the same rows can easily end up with a WAL stream much larger than the database. The WAL data contains enough information to replay every point in time from the base backup until the current time. That can be significantly larger than the database, which only holds the latest copy of the data.

One of the biggest contributors to WAL volume is the full page writes that protect against incomplete writes. See the "question regarding full_page_writes" thread happening on this list recently for details. The first time a page is modified after each checkpoint, a full copy of that page is written out to the WAL.

What I have to do in a lot of cases is significantly decrease the number of checkpoints in order to keep this overhead under control. The default config triggers a checkpoint every checkpoint_segments of work, and every checkpoint_timeout of time. That makes for a checkpoint at least every 5 minutes, and even more often under heavy load. If you increase checkpoint_segments a whole lot, all of your checkpoints will be triggered by the timeout instead. Then you can watch the WAL volume decrease as you increase checkpoint_timeout. I've had to set checkpoint_timeout as high as 30 minutes on busy systems to lower the WAL overhead.

--
Greg Smith   2ndQuadrant US    greg@xxxxxxxxxxxxxxx   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
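[Editor's note: as a rough sketch of the tuning described above, the relevant postgresql.conf settings on a busy system might look something like the following. The specific values are illustrative only, not recommendations from the original post, and checkpoint_segments applies to PostgreSQL versions that still have that setting.]

    # Raise checkpoint_segments so the segment-count limit rarely forces a
    # checkpoint; most checkpoints then happen on the timeout instead.
    checkpoint_segments = 64

    # Stretch the time between timed checkpoints so each frequently-updated
    # page needs a new full-page image written to WAL less often.
    checkpoint_timeout = 30min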