Re: Replaying 48 WAL files takes 80 minutes

"ktm@xxxxxxxx" <ktm@xxxxxxxx> · Tue, 30 Oct 2012 08:05:33 -0500

On Tue, Oct 30, 2012 at 09:50:44AM +0100, Albe Laurenz wrote:
> >> On Mon, Oct 29, 2012 at 6:05 AM, Albe Laurenz <laurenz.albe@xxxxxxxxxx> wrote:
> >>> I am configuring streaming replication with hot standby
> >>> with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
> >>> PostgreSQL was compiled from source.
> >>>
> >>> It works fine, except that starting the standby took forever:
> >>> it took the system more than 80 minutes to replay 48 WAL files
> >>> and connect to the primary.
> >>>
> >>> Can anybody think of an explanation why it takes that long?
> 
> Jeff Janes wrote:
> >> Could the slow log files be replaying into randomly scattered pages
> >> which are not yet in RAM?
> >>
> >> Do you have sar or vmstat reports?
> 
> The sar reports from the time in question tell me that I read
> about 350 MB/s and wrote less than 0.2 MB/s.  The disks were
> fairly busy (around 90%).
> 
> Jeff Trout wrote:
> > If you do not have good random I/O performance, log replay is nearly unbearable.
> > 
> > Also, what I/O scheduler are you using? If it is cfq, change that to deadline or noop.
> > That can make a huge difference.
> 
> We use the noop scheduler.
> As I said, an identical system performed well in load tests.
> 
> The sar reports lend credence to Jeff Janes' theory.
> Why does WAL replay read much more than it writes?
> I thought that pretty much every block read during WAL
> replay would also get dirtied and hence written out.
> 
> I wonder why the performance is good in the first few seconds.
> Why should exactly the pages that I need in the beginning
> happen to be in cache?
> 
> And finally: are the numbers I observe (replay 48 files in 80
> minutes) ok or is this terribly slow as it seems to me?
> 
> Yours,
> Laurenz Albe
> 

Hi,

The load tests probably had the "important" data already cached. Replaying
a WAL file means bringing the affected data pages back into memory, and
that happens with a random I/O pattern. Perhaps priming the file cache with
some sequential reads beforehand would allow the random I/O to hit memory
instead of disk. I may be misremembering, but wasn't there an associated
project/program that would parse the WAL files and generate cache-priming
reads?
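
To make the priming idea concrete, here is a rough sketch in Python. It is
not the program I was trying to remember, and it does not parse WAL at all:
it simply reads every file under a directory front to back so the kernel
keeps the pages in its file cache. The function name, the chunk size, and
the directory you point it at are all assumptions for illustration, and it
only helps if the data that replay touches actually fits in RAM.

```python
import os

CHUNK_SIZE = 1024 * 1024  # assumed read size: 1 MiB per call


def prime_file_cache(directory, chunk_size=CHUNK_SIZE):
    """Sequentially read every regular file under `directory`.

    The data is discarded; the point is only to pull the pages into
    the operating system's file cache. Returns the number of bytes read.
    """
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # skip sockets, broken symlinks, and the like
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # The file vanished or is unreadable; keep going.
                continue
    return total
```

You would call it on the standby before starting replay, for example
prime_file_cache("/path/to/data/base"), where the path is a placeholder for
your own data directory. Reading whole files is a blunt instrument compared
with reading only the blocks the WAL references, but it turns the first
pass over the data into sequential I/O.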

Regards,
Ken

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance