postgres wal sender replication timeout during pg_basebackup

Hello all,
   I posted this to dba.stackexchange but haven't gotten any responses, so I thought this list might be more focused and give me a better shot at an answer.

   I'll start by noting that I am still somewhat green with Postgres...  One of our applications requires it, so I have been learning as I go...

   Right now I am working on a Postgres 9.2 Active/Standby cluster on Debian wheezy to make the application more fault tolerant, based on the ClusterLabs pgsql cluster documentation [http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster].

   In the lab, I am able to get this set up and working without a problem, but on the pre-production cluster I keep running into a WAL sync error.

   I brought the database files over from the current single production postgres server. By this I mean I shut down postgres, tar-ed up the data directory, and copied it over to the cluster's Master node. I put the files in place, set the permissions, and was able to start postgres on the Master via corosync just fine.

   In preparing the slave, I used the pg_basebackup tool to bring the database over from the Master and this is where I keep having issues. As it is transferring, at about 57% I see the error:

>  $ pg_basebackup -h db-master -U u_repl -D /db/data/postgresql/9.2/main/ -X stream -P
>  pg_basebackup: could not receive data from WAL stream: SSL connection has been closed unexpectedly
>  176472/176472 kB (100%), 1/1 tablespace
>  pg_basebackup: child process exited with error 1

And on the server, I see:

>  2016-04-06 21:05:31 UTC LOG:  terminating walsender process due to replication timeout

  But the transfer doesn't stop and keeps going to completion.

  I found this [http://dba.stackexchange.com/questions/59916/streaming-replication-log-is-puzzling-me] question on stackexchange, which suggests setting "ssl_renegotiation_limit" to 0, but that didn't make much difference.
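
  For reference, a minimal sketch of the settings involved, assuming they are made in postgresql.conf on the Master. The "replication_timeout" line is an assumption and is not mentioned anywhere above: it is included only because it is the 9.2 parameter behind the "terminating walsender process due to replication timeout" log message, and the value shown is the default, not a recommendation.

```
# postgresql.conf on the Master (Postgres 9.2); illustrative sketch only

# Disable SSL renegotiation, as suggested in the linked stackexchange question
ssl_renegotiation_limit = 0

# Assumption, not from the original post: the walsender is terminated when the
# replication connection has been inactive for longer than this (default 60s)
replication_timeout = 60s
```

  The values actually in effect can be checked from psql with "SHOW ssl_renegotiation_limit;" and "SHOW replication_timeout;".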

  Anyone have any ideas? I didn't find any reference to this problem in the mailing list archives. I am completely baffled as to why this would raise an error but keep on going. Maybe this isn't a problem at all?  It is the same procedure I used in the lab setup... the only difference is that the production database is much bigger in size.

Any thoughts??


-With kind regards, Peter.


-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


