Search Postgresql Archives

Re: Calculating Replication Lag - units

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/26/2012 08:16 AM, David Kerr wrote:
On 06/26/2012 05:11 AM, Stuart Bishop wrote:
On Tue, Jun 26, 2012 at 6:21 AM, David Kerr<dmk@xxxxxxxxxxxxxx>  wrote:
On Mon, Jun 25, 2012 at 02:17:22PM -0700, Steve Crawford wrote:
- On 06/25/2012 01:17 PM, David Kerr wrote:
->Howdy,
->
->When calculating Replication lag, I know that we have to compare the
->pg_current_xlog_location
->to pg_last_xlog_receive_location, etc. but what I'm trying to figure out
->is what are
->the units that I'm left with after the calculation.
->
->(i.e., does the xlog_location imply some time value?)
->
->Here's the output of the (slightly modified script)
->Master: 5003964876715
->Receive: 5003964876715
->Replay: 5003964765203
->
->receive.value 0
->apply.value 111512
->
->111512 isn't inherently useful to me on its own.
->
->Any tips?
->
- How about now()-pg_last_xact_replay_timestamp() (however this can be a
- large number if there have not been any recent transactions on the
- master). I suppose you could do something like:
-
- case when pg_last_xlog_receive_location() =
- pg_last_xlog_replay_location() then '0 seconds'::interval
- else now()-pg_last_xact_replay_timestamp() end as log_delay;

i don't know for sure that 111512 is a time value.. that's kind of
what i'm wondering. If i knew that it was like miliseconds or something
that would be helpful.

On the hot standby:

     SELECT now()-pg_last_xact_replay_timestamp() AS lag;

This gives you the lag time as a PostgreSQL interval.

(It also might give you a value if you run it on a database that is
not a hot standby if it started in recovery mode).

It seems difficult or impossible to calculate this on the master.



Ah, awesome. I don't need to calculate it on the master so that's perfect.

Thanks!


But beware. If your master server has any idle time you may be misled by the simple calculation. I was running a pgbench test on a replicated pair of machines. It finished this morning sometime so the lag delay shows 02:31:11.651118, a value that might set off alarm bells. That's why I used the case statement to force the interval to 0 if the replay is up-to-date.

I think it is still worthwhile to check pg_stat_replication on the master to make sure that it is still *sending* logs and perhaps cross-checking the current log position on the master with the replayed log location on the standby to see if they are reasonably close.

An additional verification check I've toyed with is to have a cron script on the master update a one-row one-column table that holds a timestamp and checking that timestamp on the standby(s) to double-check that it is not too-far out-of-date. (This would also force regular data delivery to the standby so that pg_last_xact_replay_timestamp() should not lag far behind on an otherwise idle server.)

Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux