On 12/15/2014 11:12 AM, Joseph Kregloh wrote:
Hello,
I have a master multi slave streaming replication setup. One master and
two slaves. I need to do some maintenance on one of the slaves as one of
the drives died however there is some other weird things going on in
that array that I would need to investigate. So I am expecting the
machine to be down at least two hours.
I remember reading that if a master cannot connect to the slave it would
hold the log file from shipping. Is there any other way to hold the file
until the slave comes back online? Would it affect both slaves not
getting their files shipped over?
The good thing is that the slave in question is not serving any connections.
From what I remember emptying out the archive_command would pause log
shipping. Can the same be done by issuing a pg_stop_backup()?
Thanks,
-Joseph Kregloh
I kinda turn mine around, so to speak.
My master (web1) PG has this:
archive_command = '/usr/local/bin/pg_arch.sh "%p" "%f"'
/usr/local/bin/pg_arch.sh:
---------------
#!/bin/bash
# pg_arch.sh "%p" "%f"
archive='/pub/pgarchive'
set -e
if [ ! -f $archive/$2 ]
then
/usr/bin/cp $1 $archive/webserv/$2
/usr/bin/ln $archive/webserv/$2 $archive/web2/$2
fi
exit 0
---------------
I have one master (web1) and two slaves (web2 and webserv)
This always copies, and always returns 0. (Note the use of ln so extra
disk space isn't wasted).
At this point I only collect wall, this script never removes it. One
slave is very close, so it gets updated quickly. The other is very far,
and only updates at night when I can copy for less $$.
It doesnt really matter how I get the two slaves updated (the close one,
actually, uses steaming. The far one rsync, but that's besides the point)
The clean up happens in reverse. I have a perl cron job that runs every
half hour. It connects to master (web1) and each slave, and runs
something like:
$db is a slave, $master is the master.
$q = $db->prepare("SELECT pg_last_xlog_replay_location()");
$q->execute();
my ($webrp) = $q->fetchrow_array();
$q = undef;
$db->disconnect();
$q = $master->prepare("select file_name from pg_xlogfile_name_offset(
'$webrp' )");
$q->execute();
my ($web) = $q->fetchrow_array();
$q = undef;
system("/usr/bin/sudo -u postgres /usr/local/pgsq/bin/pg_archivecleanup
/pub/pgarchive/web2 $web")
What we do is have the master query the slave's location and then run
pg_archivecleanup. This way if we loose communication I wont clean up
files. And I wont clean up WAL until the slave has actually applied it.
And each slave is independent, so I can take one down and the master
will just keep collecting. As soon as I bring the slave back up, we get
a response and starting cleaning again.
-Andy
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general