Re: Problems with Streaming Replication and PostgreSQL 9.2.2

JotaComm <jota.comm@xxxxxxxxx> · Mon, 1 Apr 2013 15:30:48 -0300

Hello, Lonni

I use rsync to transfer the files (pg_xlog) from master to slave.

rsync -Crap %p postgres@slaveaddress:/slavetransactionlogs/%f

In my recovery.conf (slave) I have the following statement:

restore_command = 'test -f $slavetransactionlogs/%f && cp $slavetransactionlogs/%f %p'

Best Regards

2013/4/1 Lonni J Friedman <netllama@xxxxxxxxx>

If you're not running pg_basebackup with the -x option, how are you

getting the WAL files onto the slave?

On Mon, Apr 1, 2013 at 11:19 AM, JotaComm <jota.comm@xxxxxxxxx> wrote:

> Hello, everybody

>

> I have one problem and I need some help.

>

> My environment: one master and one slave (PostgreSQL 9.2.2).

>

> My cluster has about 160GB and pg_basebackup to syncronize them (master and

> slave).

>

> The syntax is:

>

> pg_basebackup -h productionaddress -p productionport -U productionuser -D

> datadirectory -P -v

>

> My recovery.conf:

>

> standby_mode = 'on'

>

> primary_conninfo = 'host=productionaddress port=productionport

> user=productionuser'

>

> archive_cleanup_command = 'pg_archivecleanup /slave/transactionlogs %r'

>

> My postgresql.conf: (master)

>

> wal_level = hot_standby

>

> checkpoint_segments = 10

>

> archive_mode = on

>

> archive_command = 'rsync -Crap %p

> postgres@slaveaddress:/slave/transactionlogs/%f'

>

> max_wal_senders = 1

>

> wal_keep_segments = 50

>

> My postgresql.conf: (slave)

>

> checkpoint_segments = 10

>

> hot_standby = on

>

> In my slave (logs) I have the following erros: My first attempt

>

> 2013-03-07 15:58:21 BRT [11817]: [1-1] user=,db= LOG:  database system was

> interrupted; last known up at 2013-03-07 15:55:43 BRT

> 2013-03-07 15:58:21 BRT [11817]: [2-1] user=,db= LOG:  entering standby mode

> 2013-03-07 15:58:21 BRT [11818]: [1-1] user=,db= LOG:  streaming replication

> successfully connected to primary

> 2013-03-07 15:58:25 BRT [11817]: [3-1] user=,db= LOG:  consistent recovery

> state reached at 141/8FBB5F0

> 2013-03-07 15:58:25 BRT [11817]: [4-1] user=,db= LOG:  redo starts at

> 141/2251F90

> 2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL:  could not access

> status of transaction 30622931

> 2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL:  Could not read

> from file "pg_clog/001D" at offset 49152: Success.

> 2013-03-07 15:58:25 BRT [11817]: [7-1] user=,db= CONTEXT:  xlog redo commit:

> 2013-03-07 15:55:40.673623-03

> 2013-03-07 15:58:25 BRT [11767]: [1-1] user=,db= LOG:  startup process (PID

> 11817) exited with exit code 1

> 2013-03-07 15:58:25 BRT [11767]: [2-1] user=,db= LOG:  terminating any other

> active server processes

>

> In my slave (logs) I have the following erros: My second attempt

>

> 2013-03-11 12:07:49 BRT [5862]: [1-1] user=,db= LOG:  database system was

> interrupted; last known up at 2013-03-11 12:06:31 BRT

> 2013-03-11 12:07:49 BRT [5862]: [2-1] user=,db= LOG:  entering standby mode

> 2013-03-11 12:07:49 BRT [5864]: [1-1] user=,db= LOG:  streaming replication

> successfully connected to primary

> 2013-03-11 12:07:53 BRT [5862]: [3-1] user=,db= LOG:  consistent recovery

> state reached at 168/816AE10

> 2013-03-11 12:07:53 BRT [5862]: [4-1] user=,db= LOG:  redo starts at

> 167/FEC3D828

> 2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL:  could not access

> status of transaction 36529670

> 2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL:  Could not read from

> file "pg_clog/0022" at offset 212992: Success.

> 2013-03-11 12:07:53 BRT [5862]: [7-1] user=,db= CONTEXT:  xlog redo commit:

> 2013-03-11 12:05:35.069759-03

> 2013-03-11 12:07:53 BRT [5762]: [1-1] user=,db= LOG:  startup process (PID

> 5862) exited with exit code 1

> 2013-03-11 12:07:53 BRT [5762]: [2-1] user=,db= LOG:  terminating any other

> active server processes

>

> I had the same problem, but in different files (pg_clog):

>

> First attempt:

>

> 2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL:  could not access

> status of transaction 30622931

> 2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL:  Could not read

> from file "pg_clog/001D" at offset 49152: Success.

>

> Second attempt:

>

> 2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL:  could not access

> status of transaction 36529670

> 2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL:  Could not read from

> file "pg_clog/0022" at offset 212992: Success.

>

> When I started the synchronism at the first time (no data), it was OK. Now,

> my cluster has about 160GB and when the replication finishes I have these

> problems. I noticed that pg_clog/current_file is the problem.

>

> Any idea? Suggestions?

>

> Thank you.

>

> Best Regards

>

> João Paulo

> --

> JotaComm

> http://jotacomm.wordpress.com

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

L. Friedman                                    netllama@xxxxxxxxx

LlamaLand                       https://netllama.linux-sxs.org

-- 
JotaComm
http://jotacomm.wordpress.com