The Bucardo check_postgres module contains a hot_standby_delay check function which will calculate the delta between the xlog position of the master with the slave(s). http://bucardo.org/check_postgres/check_postgres.pl.html#hot_standby_delay -----Original Message----- From: pgsql-admin-owner@xxxxxxxxxxxxxx [mailto:pgsql-admin-owner@xxxxxxxxxxxxxx] On Behalf Of Steve Crawford Sent: Wednesday, November 28, 2012 1:02 PM To: pgsql-admin@xxxxxxxxxxxxxx Subject: Re: Fwd: Monitoring Replication on Master/Slave Postgres(9.1) On 11/28/2012 10:21 AM, Shams Khan wrote: > ...how do we ensure my replication is working fine?... > Below is the core of one of my bash-script tools. It could use some tweaking (comments welcome) but works well. The script is run every minute by cron on master and standby servers. It auto-determines whether the server is currently a master or standby so the same script can be deployed to all servers. If a master-server, it updates a one-record test table with a current timestamp to ensure there is activity on the master. If a standby-server, it determines the lag based both on the age of pg_last_xact_replay_timestamp() and on the age of the record in the test table then returns the worst of the two. The delay value is set in $standby_delay which is a value in seconds. It's up to you to decide what constitutes an issue that requires attention (but remember that 60-seconds does not necessarily indicate a problem on an idle server). My first-level alert triggers at 130-seconds and I have never hit that much of a delay. #!/bin/bash # # Check PostgreSQL sync-status # # Requires table "sync_status" with column "sync_time" of type timestamp with time zone # We need a temp file tempquery="$(mktemp /tmp/monitor_db_synchronizationXXXXXXXXXX)" # If master, update sync_status timestamp and return 0. If standby, check both age # of log-replay location and of timestamp in sync_status table and set $standby_delay # to the greater of the two (in seconds) # standby_delay=$( psql -q --tuples-only --no-align 2>/dev/null <<EOS \o ${tempquery} select case when setting='on' then ' with logdelay as ( select case when pg_last_xlog_receive_location()=pg_last_xlog_replay_location() then 0::int else (extract(epoch from now())-extract(epoch from pg_last_xact_replay_timestamp()))::int end as replicadelay union select (extract(epoch from now())-extract(epoch from sync_time))::int as replicadelay from sync_status ) select max(replicadelay) from logdelay ; ' else ' begin; delete from sync_status; insert into sync_status (sync_time) values (now()) returning 0::int as replicadelay; commit; ' end from pg_settings where name='transaction_read_only'; \o \i ${tempquery} EOS ) # Cleanup temp file test -f "${tempquery}" && rm "${tempquery}" # Do some alert based on the number of seconds of lag between master and standby here Cheers, Steve -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin