Re: Fwd: Monitoring Replication on Master/Slave Postgres(9.1)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for the response Steve...It was really helpful:

Below are some doubts I wanted to clarify..please read and suggest.
Can we also check if replication was broken earlier...somehow due to power failure of some other reasons in past...The reason I am asking is I am getting some discrepancies in data between master and slave...? I want to know the reason of it...

thanx

On Thu, Nov 29, 2012 at 12:32 AM, Steve Crawford <scrawford@xxxxxxxxxxxxxxxxxxxx> wrote:
On 11/28/2012 10:21 AM, Shams Khan wrote:
...how do we ensure my replication is working fine?...


Below is the core of one of my bash-script tools. It could use some tweaking (comments welcome) but works well. The script is run every minute by cron on master and standby servers. It auto-determines whether the server is currently a master or standby so the same script can be deployed to all servers.

If a master-server, it updates a one-record test table with a current timestamp to ensure there is activity on the master.

If a standby-server, it determines the lag based both on the age of pg_last_xact_replay_timestamp() and on the age of the record in the test table then returns the worst of the two.

The delay value is set in $standby_delay which is a value in seconds. It's up to you to decide what constitutes an issue that requires attention (but remember that 60-seconds does not necessarily indicate a problem on an idle server). My first-level alert triggers at 130-seconds and I have never hit that much of a delay.

#!/bin/bash
#
# Check PostgreSQL sync-status
#
# Requires table "sync_status" with column "sync_time" of type timestamp with time zone


# We need a temp file
tempquery="$(mktemp /tmp/monitor_db_synchronizationXXXXXXXXXX)"

# If master, update sync_status timestamp and return 0. If standby, check both age
# of log-replay location and of timestamp in sync_status table and set $standby_delay
# to the greater of the two (in seconds)
#
standby_delay=$(
psql -q --tuples-only --no-align 2>/dev/null <<EOS
\o ${tempquery}
select
case when setting='on' then
'
with logdelay as
    (
    select
    case when
pg_last_xlog_receive_location()=pg_last_xlog_replay_location() then 0::int
    else
        (extract(epoch from now())-extract(epoch from pg_last_xact_replay_timestamp()))::int
    end as replicadelay
    union
    select
        (extract(epoch from now())-extract(epoch from sync_time))::int as replicadelay
    from
        sync_status
    )
select
    max(replicadelay)
from
    logdelay
;
'
else
'
begin;
delete from sync_status;
insert into sync_status (sync_time) values (now()) returning 0::int as replicadelay;
commit;
'
end
from pg_settings where name='transaction_read_only';
\o
\i ${tempquery}
EOS
)

# Cleanup temp file
test -f "${tempquery}" && rm "${tempquery}"

# Do some alert based on the number of seconds of lag between master and standby here

Cheers,
Steve



--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux