On Tue, Jun 26, 2012 at 09:13:44AM -0700, Steve Crawford wrote: - On 06/26/2012 08:16 AM, David Kerr wrote: - >On 06/26/2012 05:11 AM, Stuart Bishop wrote: - >>On Tue, Jun 26, 2012 at 6:21 AM, David Kerr<dmk@xxxxxxxxxxxxxx> wrote: - >>>On Mon, Jun 25, 2012 at 02:17:22PM -0700, Steve Crawford wrote: - >>>- On 06/25/2012 01:17 PM, David Kerr wrote: - >>>->Howdy, - >>>-> - >>>->When calculating Replication lag, I know that we have to compare the - >>>->pg_current_xlog_location - >>>->to pg_last_xlog_receive_location, etc. but what I'm trying to - >>>figure out - >>>->is what are - >>>->the units that I'm left with after the calculation. - >>>-> - >>>->(i.e., does the xlog_location imply some time value?) - >>>-> - >>>->Here's the output of the (slightly modified script) - >>>->Master: 5003964876715 - >>>->Receive: 5003964876715 - >>>->Replay: 5003964765203 - >>>-> - >>>->receive.value 0 - >>>->apply.value 111512 - >>>-> - >>>->111512 isn't inherently useful to me on its own. - >>>-> - >>>->Any tips? - >>>-> - >>>- How about now()-pg_last_xact_replay_timestamp() (however this can - >>>be a - >>>- large number if there have not been any recent transactions on the - >>>- master). I suppose you could do something like: - >>>- - >>>- case when pg_last_xlog_receive_location() = - >>>- pg_last_xlog_replay_location() then '0 seconds'::interval - >>>- else now()-pg_last_xact_replay_timestamp() end as log_delay; - >>> - >>>i don't know for sure that 111512 is a time value.. that's kind of - >>>what i'm wondering. If i knew that it was like miliseconds or something - >>>that would be helpful. - >> - >>On the hot standby: - >> - >> SELECT now()-pg_last_xact_replay_timestamp() AS lag; - >> - >>This gives you the lag time as a PostgreSQL interval. - >> - >>(It also might give you a value if you run it on a database that is - >>not a hot standby if it started in recovery mode). - >> - >>It seems difficult or impossible to calculate this on the master. - >> - >> - > - >Ah, awesome. I don't need to calculate it on the master so that's - >perfect. - > - >Thanks! - > - > - But beware. If your master server has any idle time you may be misled by - the simple calculation. I was running a pgbench test on a replicated - pair of machines. It finished this morning sometime so the lag delay - shows 02:31:11.651118, a value that might set off alarm bells. That's - why I used the case statement to force the interval to 0 if the replay - is up-to-date. - - I think it is still worthwhile to check pg_stat_replication on the - master to make sure that it is still *sending* logs and perhaps - cross-checking the current log position on the master with the replayed - log location on the standby to see if they are reasonably close. - - An additional verification check I've toyed with is to have a cron - script on the master update a one-row one-column table that holds a - timestamp and checking that timestamp on the standby(s) to double-check - that it is not too-far out-of-date. (This would also force regular data - delivery to the standby so that pg_last_xact_replay_timestamp() should - not lag far behind on an otherwise idle server.) - - Cheers, - Steve - I see, thanks. I was completely missing that you were using different functions than me and that was throwing me off. =) -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general