OK, thanks. One of the motivations for asking these questions is that we are investigating ways to implement automated node removal from a VIP pool. We would like the VIP management software (currently a dumb load balancer) to be able to query the health of a particular node directly and, if that node reports back to the VIP manager that it is lagging too much, have the VIP manager take the node out of its pool of backend servers.

Currently it appears I will have to query the other nodes in the cluster to determine the replication health of a particular node, and figure out a way to send that status back to the VIP manager in a form it can act on. Any suggestions on how to accomplish that would be appreciated.

Dennis

> On Apr 20, 2015, at 7:25 AM, Craig Ringer <craig@xxxxxxxxxxxxxxx> wrote:
>
> On 16 April 2015 at 23:58, Dennis <dennisr@xxxxxxxx> wrote:
>> I need some clarification on how to monitor BDR nodes. In particular determining replication lag. As an example, I have a two node cluster with nodes ‘A’ and ‘B’. I need to be able to look at node ‘B’ and determine if it is lagging behind node ‘A’, by interrogating node ‘B’ only.
>
> You can't, that doesn't really make sense - in BDR, or in regular
> PostgreSQL streaming replication.
>
> For that to be possible, node 'B' would need some side-channel by
> which it found out the current WAL insert position of node 'A'. Which
> effectively means communicating in real time with node 'A'... so the
> client might as well do it instead. We can't do this effectively on
> the walsender stream without some kind of interrupt message that can
> be priority-injected into the stream, and even then it wouldn't help
> if the issue was packet loss causing connection issues, etc.
>
> If you're in a position where node 'B' can make direct libpq
> non-replication connections to 'A' but the client can't, you could use
> postgres_fdw to expose a view of node A's
> pg_current_xlog_insert_location(), plus the pg_replication_slots and
> pg_stat_replication views. That seems a bit of an odd situation to me,
> though.
>
>> Because it is querying the pg_stat_replication table, I will need to run this query on node ‘A’ to check the lag on node ‘B’, is that true?
>
> Correct. I'll make the docs more explicit about that.
>
>> I need to be able to run a query on node ‘B’ to determine if node ‘B’ is behind. I am not sure the above query will work for that use case.
>
> It won't, and you really can't.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
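
For illustration only, here is a rough sketch of the decision an external health check might make once it has the two positions from node 'A'. It assumes the check can read pg_current_xlog_insert_location() on node 'A' and the flush_location that node A's pg_stat_replication reports for node 'B', both as text. The function names and the 16 MB threshold are made up for this example; the arithmetic is the same subtraction that pg_xlog_location_diff() performs server-side.

```python
# Hypothetical helper for a VIP-manager health check. Nothing here talks to
# PostgreSQL; the two LSN strings are assumed to have been fetched from
# node 'A' already (insert position and node B's flush_location).

DEFAULT_MAX_LAG_BYTES = 16 * 1024 * 1024  # arbitrary example threshold


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/3000060' to an absolute byte position.

    The text form is two hexadecimal numbers: the high 32 bits and the
    low 32 bits of the WAL position, separated by a slash.
    """
    high, low = lsn.strip().split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(insert_lsn: str, flush_lsn: str) -> int:
    """Bytes of WAL written on the upstream but not yet flushed downstream."""
    return max(0, lsn_to_int(insert_lsn) - lsn_to_int(flush_lsn))


def node_is_healthy(insert_lsn: str, flush_lsn: str,
                    max_lag_bytes: int = DEFAULT_MAX_LAG_BYTES) -> bool:
    """Return True if the downstream node should stay in the VIP pool."""
    return replication_lag_bytes(insert_lsn, flush_lsn) <= max_lag_bytes


if __name__ == "__main__":
    # Example values only; real ones would come from node 'A'.
    lag = replication_lag_bytes("0/3000060", "0/3000000")
    print("lag bytes:", lag)
    print("keep in pool:", node_is_healthy("0/3000060", "0/3000000"))
```

The result could then be exposed in whatever form the load balancer understands, for example the exit status of a check script. Note that this still depends on reaching node 'A', as explained in the quoted reply; it does not make node 'B' able to report its own lag.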