Ah, that makes sense. I think I'll add some logic to the script that has it get new data points if it comes up with a negative value.
Thanks for the insight.

QH
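
A minimal sketch of that retry idea, in case it is useful to anyone else. It is not taken from the script in the gist: `lag_with_retry`, `sample_lag`, `attempts`, and `delay_seconds` are hypothetical names, and `sample_lag` stands in for whatever code queries both servers and returns the signed lag in bytes.

```python
import time


def lag_with_retry(sample_lag, attempts=3, delay_seconds=0.5):
    """Return the first non-negative lag reading, re-sampling on negatives.

    sample_lag is a hypothetical callable that queries the primary and the
    standby and returns (primary position - standby position) in bytes.
    """
    for attempt in range(attempts):
        lag = sample_lag()
        if lag >= 0:
            return lag
        # A negative reading means the standby replayed WAL that was
        # generated after the primary was sampled, so take a fresh sample.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    # Every sample was negative: the standby is keeping up, report no lag.
    return 0
```

Falling back to zero after the last attempt is a design choice: a negative reading already shows the standby had caught up to the sampled primary position, so it should not raise an alarm.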
On Mon, Apr 22, 2013 at 5:11 PM, Andres Freund <andres@xxxxxxxxxxxxxxx> wrote:
On 2013-04-22 16:36:38 -0600, Quentin Hartman wrote:
> I'm using this script to check my replication lag on my streaming
> replication pairs with Nagios:
>
> https://gist.github.com/jacobian/743942
>
> It generally works fine, but will occasionally return a negative lag value
> (-37kb for example) which of course causes it to throw an alarm, but is
> total nonsense. I've been working on the assumption that it is some sort of
> bug in the script, but in taking a quick look at it nothing jumps out at me.
>
> Is there something in Postgres itself that could cause this to happen once
> in awhile? Is it something to be concerned about? Is there a better way to
> monitor this state?

Well, between the time pg_current_xlog_location() is run on the primary
and pg_last_xlog_replay_location() on the standby some time passes, so
it's not all that unlikely that WAL has been generated, streamed *and*
applied in that time. Given the short timeframe it only happens every
now and then.

Did you check the pg_stat_replication view on the primary?
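
To illustrate the race, here is a rough sketch of how a check might turn the two location strings into a byte difference and clamp the result. The function names are hypothetical and this is not the code from the gist. It assumes locations of the form `XLOGID/OFFSET` in hex and treats them as one 64-bit position; as far as I know, servers before 9.3 used 0xFF000000 bytes per xlogid, so `bytes_per_xlogid` is left adjustable.

```python
def lsn_to_bytes(lsn, bytes_per_xlogid=0x100000000):
    """Convert an 'XLOGID/OFFSET' hex location string to a byte position.

    bytes_per_xlogid is an assumption: 2**32 treats the location as a plain
    64-bit value; pass 0xFF000000 for servers older than 9.3.
    """
    xlogid, offset = lsn.strip().split("/")
    return int(xlogid, 16) * bytes_per_xlogid + int(offset, 16)


def replication_lag_bytes(primary_lsn, standby_lsn):
    """Lag in bytes, never negative.

    primary_lsn is sampled first, so the standby can already have replayed
    newer WAL by the time standby_lsn is sampled. A raw negative difference
    therefore means "caught up" and is reported as 0.
    """
    raw = lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)
    return max(raw, 0)
```

For example, with the primary sampled at `2/A0000000` and the standby at `2/A0009400` a moment later, the raw difference is -37888 bytes (about -37 kB), which the clamp reports as 0.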
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services